(57) Abstract: The present invention discloses polyketides and the polyketide synthases and ancillary enzymes that are capable of
producing such compounds. More particularly, the present invention discloses polynucleotides and polypeptides associated with
(i) a novel polyketide synthase linked to a non-ribosomal peptide synthetase involved in the biosynthesis of albicidins, (ii) a novel
phosphopantetheinyl transferase for activating enzymes, particularly polyketide synthases and/or non-ribosomal peptide synthetases,
associated with the biosynthesis of albicidins, and (iii) a novel methyltransferase for methylating precursors of albicidins and/or
intermediates related to albicidin biosynthesis. The present invention also discloses methods of using the aforementioned polynucleotides
and polypeptides for activating polyketide synthases and/or non-ribosomal peptide synthetases, for methylating precursors of
albicidins or their analogues and/or intermediates involved in the biosynthesis of albicidins or analogues thereof and for enhancing the
level and/or functional activity of albicidins or their analogues. Also disclosed are methods of using the polynucleotides and
polypeptides of the invention for the biosynthesis of albicidins or their analogues.

POLYNUCLEOTIDES AND POLYPEPTIDES ASSOCIATED WITH
ANTIBIOTIC BIOSYNTHESIS AND USES THEREFOR

FIELD OF THE INVENTION 

THIS INVENTION relates generally to antibiotic biosynthesis. More particularly,
the present invention relates to polyketides and the polyketide synthases and ancillary
enzymes that are capable of producing such compounds. Even more particularly, the
present invention relates to a polyketide synthase linked to a non-ribosomal peptide
synthetase involved in the biosynthesis of albicidins, to a phosphopantetheinyl transferase
for activating enzymes, particularly polyketide synthases and/or non-ribosomal peptide
synthetases, associated with the biosynthesis of albicidins, and to a methyltransferase for
methylating precursors of albicidins and/or intermediates related to albicidin biosynthesis.
The present invention also relates to biologically active fragments of the aforementioned
polypeptides and to variants and derivatives of these molecules. Further, the invention
relates to polynucleotides encoding the said polypeptides, including the xabA, xabB and
xabC genes of Xanthomonas albilineans, to polynucleotides encoding the said fragments,
variants or derivatives, to vectors comprising the said polynucleotides and to host cells
containing such vectors. The invention also relates to a transcriptional control element for
modulating the expression of polynucleotides including, for example, the xabB gene and/or
the xabC gene of Xanthomonas albilineans, or variants thereof. The invention also features
methods of using the polynucleotides, polypeptides, fragments, variants, derivatives and
vectors for activating polyketide synthases and/or non-ribosomal peptide synthetases, for
methylating precursors of albicidins or their analogues and/or intermediates involved in the
biosynthesis of albicidins or their analogues and for enhancing the level and/or functional
activity of albicidins or their analogues. The invention also encompasses methods of using
the aforesaid polynucleotides, polypeptides, fragments, variants and derivatives for the
biosynthesis of albicidins or analogues thereof.

Bibliographic details of various publications referred to by author in this 
specification are collected at the end of the description. 

BACKGROUND OF THE INVENTION

Polyketides represent a large, structurally diverse group of compounds synthesised
from 2-carbon units through a series of condensations and subsequent modifications. They
possess a broad range of biological activities including antibiotic and pharmacological
properties. For example, polyketides are represented by antibiotics such as tetracyclines
and erythromycins, immunosuppressants such as FK506, FK520 and rapamycin, anticancer
agents such as daunomycin, and veterinary products such as monensin and avermectin.

Considering the difficulty in producing polyketide compounds by conventional
chemical methodologies, and the typically low production of polyketides in wild-type
cells, there has been considerable interest in finding improved or alternate means to
produce polyketide compounds. In this regard, reference may be made to PCT publication
Nos. WO 93/13663; WO 95/08548; WO 96/40968; WO 97/02358; and WO 98/27203;
U.S. Pat. Nos. 4,874,748; 5,063,155; 5,098,837; 5,149,639; 5,672,491; and 5,712,146; Fu
et al. (1994, Biochemistry 33: 9321-9326); McDaniel et al. (1993, Science 262: 1546-1550);
and Rohr (1995, Angew. Chem. Int. Ed. Engl. 34(8): 881-888).

Polyketides are synthesised in nature by polyketide synthases (PKS). These
enzymes, which are actually complexes of multiple enzyme activities, are in some ways
similar to, but in other ways different from, the synthases that catalyse condensation of
2-carbon units in the biosynthesis of fatty acids. Specifically, PKS enzymes catalyse the
biosynthesis of polyketides through repeated (decarboxylative) Claisen condensations
between acylthioesters (e.g., acetyl, propionyl, malonyl or methylmalonyl). Following each
condensation, they introduce structural variability into the product by catalysing all, part,
or none of a reductive cycle comprising a ketoreduction, dehydration, and enoylreduction
on the β-keto group of the growing polyketide chain. PKS enzymes incorporate enormous
structural diversity into their products, in addition to varying the condensation cycle, by
controlling choice of primer, extender units, and the overall chain length and, particularly
in the case of aromatic polyketides, regiospecific cyclisation of the nascent polyketide
chain. After the carbon chain has grown to a length characteristic of each specific product,
it is released from the synthase by thiolysis or acyltransfer. Thus, the PKS complexes
consist of families of enzymes which work together to produce a given polyketide. It is the
choice of chain-building units, controlled variation in chain length, and the reductive cycle,
genetically programmed into each PKS, that contributes to the variation seen among
naturally occurring polyketides.
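
The way a partial or complete reductive cycle changes the β-carbon after each condensation can be pictured with a small lookup. The following Python sketch is illustrative only and is not part of the disclosed invention: the function name, the state labels and the abbreviation ER (for the enoylreduction step) are assumptions introduced here, and the model simply assumes that the three steps act in the fixed order ketoreduction, dehydration, enoylreduction.

```python
# Illustrative model only (names and labels are assumptions, not from the
# specification): maps the reductive steps active after one condensation
# to the resulting state of the beta-carbon.
REDUCTIVE_ORDER = ("KR", "DH", "ER")  # ketoreduction, dehydration, enoylreduction

BETA_CARBON_STATE = {
    0: "beta-keto",               # no reductive step performed
    1: "beta-hydroxyl",           # ketoreduction only
    2: "alpha,beta-double bond",  # ketoreduction + dehydration
    3: "saturated methylene",     # full reductive cycle
}


def beta_carbon_state(active_steps):
    """Return the beta-carbon state left after one condensation cycle.

    The steps are treated as strictly sequential, so only an unbroken
    run KR -> DH -> ER starting at KR is counted.
    """
    completed = 0
    for step in REDUCTIVE_ORDER:
        if step not in active_steps:
            break
        completed += 1
    return BETA_CARBON_STATE[completed]
```

Under these assumptions, a cycle with no reductive steps leaves the β-keto group intact, whereas a cycle with all three steps yields a fully saturated carbon; a dehydration step listed without a preceding ketoreduction has no effect in this model.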

Two major types of PKS enzymes are known; these differ in their composition
and mode of synthesis of the polyketide synthesised. These two major types of PKS
enzymes are commonly referred to as Type I or "modular" and Type II or "iterative" PKS
enzymes. These classifications are well known and reference may be made, for example, to
Hopwood and Khosla (1992).

The Type I or modular PKS enzymes typically catalyse the biosynthesis of
complex polyketides such as erythromycin and avermectin. These modular enzymes
include assemblies of several large multifunctional proteins carrying, between them, a set
of separate active sites for each step of carbon chain assembly and modification (Cortes et
al., 1990; Donadio et al., 1991; MacNeil et al., 1992). Accordingly, modular PKS
complexes can be viewed as biochemical assembly lines, composed of a series of catalytic
domains involved in sequential assembly and modification of acyl groups on the growing
polyketide chain (Cane et al., 1998; Keating and Walsh, 1999). The catalytic domains are
arranged in "modules", punctuated by acyl carrier protein (ACP) domains that tether the
nascent polyketide while it undergoes the catalytic modifications programmed in the
associated module. For each polyketide there is an initiation module, a series of elongation
modules that define the length and structure of the polyketide chain, and a termination
module to release the product from the final tether. The initiation module typically
comprises an acyl transferase (AT) domain that couples the initial acyl group from an
acyl-CoA substrate to the phosphopantetheinyl tether of the first ACP domain. Each elongation
module typically comprises a ketosynthase (KS), an AT and an ACP. The KS removes the
growing polyketide unit from the upstream ACP and couples it to the next acyl group in
the chain, which has already been selected and loaded by the AT onto the ACP in the same
module. Other catalytic domains (e.g., a ketoacyl reductase (KR) and a dehydratase (DH))
within an elongation module can modify the newly elongated polyketide before it is
transferred to the next module in the biochemical assembly line. A thioesterase (TE)
domain in the termination module accomplishes release of the assembled polyketide from
the last ACP in the series (Cane et al., 1998; Keating and Walsh, 1999).
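
The assembly-line organisation described above can be sketched as a simple data model. The Python below is a hypothetical illustration, not an implementation drawn from the specification: the class `Module`, the table `REQUIRED`, the function `describe` and the example assembly line are all assumptions, and the "typical" domain sets are simply those named in the preceding paragraph.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    kind: str                                    # "initiation", "elongation" or "termination"
    domains: list = field(default_factory=list)  # domain abbreviations, N- to C-terminal


# Typical minimal domain content per module type, as described in the text.
REQUIRED = {
    "initiation": ["AT", "ACP"],
    "elongation": ["KS", "AT", "ACP"],
    "termination": ["TE"],
}


def missing_domains(module):
    """List the typical domains that the module lacks."""
    return [d for d in REQUIRED[module.kind] if d not in module.domains]


def describe(assembly_line):
    """Summarise a list of modules as a modular PKS assembly line."""
    kinds = [m.kind for m in assembly_line]
    well_ordered = (
        len(kinds) >= 2
        and kinds[0] == "initiation"
        and kinds[-1] == "termination"
        and all(k == "elongation" for k in kinds[1:-1])
    )
    missing = {}
    for index, module in enumerate(assembly_line):
        absent = missing_domains(module)
        if absent:
            missing[index] = absent
    return {
        "well_ordered": well_ordered,
        "condensations": kinds.count("elongation"),  # one KS-catalysed step per elongation module
        "missing": missing,
    }


# Hypothetical two-extension assembly line, for illustration only.
example_line = [
    Module("initiation", ["AT", "ACP"]),
    Module("elongation", ["KS", "AT", "KR", "ACP"]),
    Module("elongation", ["KS", "AT", "DH", "KR", "ACP"]),
    Module("termination", ["TE"]),
]
```

Calling `describe(example_line)` reports a well-ordered line with two condensation steps and no missing domains. The model deliberately ignores the variations on the modular design that exist in nature.
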

Biosynthesis of a polyketide can involve the sequential action of several PKS
proteins, each with one to six elongation modules (MacNeil et al., 1992; Apricio et al.,
1996). There are variations on the modular PKS design, including participation by some
loading domains across modules or in trans from separate proteins (Keating and Walsh,
1999), and several examples of hybrid PKS/NRPS proteins (Albertini et al., 1995; Gehring
et al., 1998; Duitman et al., 1999; Paitan et al., 1999). Subsequent modification of the
polyketide by dedicated tailoring enzymes is generally required to complete the
biologically active product (Hopwood, 1997). Other biologically active compounds
including antibiotics comprise polypeptides assembled by non-ribosomal peptide
synthetases (NRPSs). NRPSs typically show a modular architecture and tethered
biosynthetic strategy analogous to PKSs (Cane et al., 1998; Keating and Walsh, 1999). In
NRPSs a condensation (C) domain removes the growing peptide unit from the upstream
peptidyl carrier protein (PCP) domain and couples it to the next amino acid group in the
chain, which has already been selected and loaded by an adenylation (A) domain onto the
PCP in the same module (Marahiel et al., 1997; Stachelhaus et al., 1998). Other catalytic
domains (e.g., epimerase or N-methyltransferase) within an elongation module can modify
the newly elongated polypeptide before it is transferred to the next module in the
biochemical assembly line (Marahiel et al., 1997).

Many phytopathogenic bacteria and fungi secrete toxins with phytotoxic activity
and a broad spectrum of antimicrobial properties (Guenzi et al., 1998). Albicidin
phytotoxins, polyketides produced by Xanthomonas albilineans, are key
pathogenicity factors in the development of leaf scald, one of the most devastating diseases of
sugarcane (Saccharum interspecific hybrids) (Ricaud and Ryan, 1989; Zhang and Birch,
1997; Zhang et al., 1999). Albicidins selectively block prokaryote DNA replication and cause
the characteristic chlorotic symptoms of leaf scald disease by blocking chloroplast
development (Birch and Patil, 1983; 1985b; 1987a; 1987b). Because albicidins are rapidly
bactericidal at nanomolar concentrations against a broad range of Gram-positive and
Gram-negative bacteria, they are also of interest as potential clinical antibiotics (Birch and
Patil, 1985a).

The major antimicrobial component of the family of albicidins produced in
culture by X. albilineans has been partially characterised as a low Mr compound with
several aromatic rings (Birch and Patil, 1985a). Low yields have slowed studies into the
chemical structure of albicidin, its application as a tool to study prokaryote DNA
replication, and its development as a clinical antibiotic (Zhang et al., 1998). Genetic
analysis of albicidin biosynthesis is likely to indicate approaches to increase yields,
probable structural features, and opportunities for engineering novel antibiotics in this
family.

SUMMARY OF THE INVENTION

The present invention arises in part from the identification and characterisation of
several X. albilineans genes associated with albicidin biosynthesis. In particular, the
present inventor has isolated a novel X. albilineans gene (xabB), which encodes a large
protein (predicted Mr 525,695), with a modular architecture indicative of a multifunctional
PKS linked to a non-ribosomal peptide synthetase (NRPS). At 4801 amino acids in length,
the product of xabB (XabB) is the largest reported PKS-NRPS. Twelve catalytic domains
in this multifunctional enzyme are arranged in the order N-terminus-acyl-CoA ligase (AL)-
acyl carrier protein (ACP)-β-ketoacyl synthase (KS)-β-ketoacyl reductase (KR)-ACP-
ACP-KS-peptidyl carrier protein (PCP)-condensation domain (C)-adenylation domain (A)-
PCP-C. The modular architecture of XabB indicates likely steps in albicidin biosynthesis
and approaches to enhance antibiotic yield. The novel pattern of domains, in comparison
with known PKS-NRPS enzymes for antibiotic production, also contributes to the
knowledge base for rational design of enzymes producing novel antibiotics. The present
inventor has found that XabB is required for the production of albicidins and that enhanced
expression of xabB leads to increased levels and/or functional activities of albicidin
antibiotics.
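
As a quick consistency check on the figures quoted above, the domain order can be tallied and the predicted Mr divided by the stated length. This Python sketch uses only values given in this summary; the variable names are introduced here for illustration, and the mean residue mass is a rough derived quantity, not a value reported in the specification.

```python
from collections import Counter

# Values quoted in the summary above.
XABB_DOMAIN_ORDER = "AL-ACP-KS-KR-ACP-ACP-KS-PCP-C-A-PCP-C"  # N- to C-terminal
XABB_LENGTH_AA = 4801
XABB_PREDICTED_MR = 525_695

domains = XABB_DOMAIN_ORDER.split("-")
domain_counts = Counter(domains)

# Rough derived figure: average mass contributed per amino acid residue.
mean_residue_mass = XABB_PREDICTED_MR / XABB_LENGTH_AA

print(len(domains))                 # number of catalytic domains listed
print(dict(domain_counts))          # how often each domain type occurs
print(round(mean_residue_mass, 1))  # average residue mass in Da
```

The tally gives twelve domains, in agreement with the text, with three ACP domains and two each of KS, PCP and C.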

A gene (xabC) encoding a novel O-methyltransferase has also been isolated,
which methylates albicidin precursors and/or intermediates involved in albicidin
biosynthesis. Surprisingly, enhanced expression of xabC has been found to increase the
levels and/or functional activities of albicidin antibiotics.

The present inventor has also isolated a gene (xabA) encoding a
phosphopantetheinyl transferase (PPTase), which is required for post-translational
activation of synthetases in the albicidin biosynthetic pathway. In this regard, it is known
that inefficient phosphopantetheinylation has limited the activity of other antibiotic
synthetases overexpressed in heterologous species (Walsh et al., 1997). Accordingly, the
isolated xabA gene, together with its target in the albicidin biosynthetic pathway (e.g.,
xabB), provides the means to engineer high level co-expression of the albicidin synthetase
and its activating PPTase to obtain albicidins in higher yields, and ultimately to manipulate
the elements of the albicidin biosynthetic machinery, by mutagenesis or by other means, to
produce desired structural variants of this novel antibiotic class.

The above genes, in whole or in part, together with their variants and derivatives,
are useful inter alia for modulating the level and/or functional activity of albicidins, for 
expressing PKS enzymes in recombinant host cells, for producing polyketides including 
albicidins and their analogues and for combinatorial biosynthesis, as described hereinafter. 

Accordingly, one aspect of the present invention contemplates an isolated
polypeptide encoding at least a portion of an albicidin PKS-NRPS (XabB) or its variants or
derivatives. In one embodiment of this type, the invention provides an isolated polypeptide
comprising at least one domain selected from the group consisting of:

(a) an acyl-CoA ligase (AL) domain comprising a sequence set forth in any one or
more of SEQ ID NO: 6 and 8, or variants thereof;

(b) a β-ketoacyl synthase (KS) domain comprising a sequence set forth in any one or
more of SEQ ID NO: 10, 12, 14, 16, 18 and 20, or variants thereof;

(c) a β-ketoacyl reductase (KR) domain comprising the sequence set forth in SEQ ID
NO: 22, or variants thereof;

(d) an acyl carrier protein (ACP) domain comprising a sequence set forth in any one
or more of SEQ ID NO: 24, 26 and 28, or variants thereof;

(e) an adenylation (A) domain comprising a sequence set forth in any one or more of
SEQ ID NO: 30, 32, 34, 36, 38, 40, 42, 44, 46 and 48, or variants thereof;

(f) a peptidyl carrier protein (PCP) domain comprising a sequence set forth in any
one or more of SEQ ID NO: 50 and 52, or variants thereof; and

(g) a condensation (C) domain comprising a sequence set forth in any one or more of
SEQ ID NO: 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78 and 80, or variants
thereof.

Preferably, the AL domain comprises each of the sequences set forth in SEQ ID
NO: 6 and 8, or variants thereof.

In one embodiment, the KS domain preferably comprises each of the sequences 
set forth in SEQ ID NO: 10, 12 and 14, or variants thereof. In an alternate embodiment, the 
KS domain preferably comprises each of the sequences set forth in SEQ ID NO: 16, 18 and 
20, or variants thereof. 

Preferably, the A domain comprises each of the sequences set forth in SEQ ID
NO: 30, 32, 34, 36, 38, 40, 42, 44, 46 and 48, or variants thereof.

In one embodiment, the C domain preferably comprises each of the sequences set
forth in SEQ ID NO: 54, 56, 58, 60, 62, 64 and 66, or variants thereof. In an alternate
embodiment, the C domain preferably comprises each of the sequences set forth in SEQ ID
NO: 68, 70, 72, 74, 76, 78 and 80, or variants thereof.

In another embodiment, the invention provides an isolated polypeptide comprising
at least a biologically active fragment or portion of the sequence set forth in SEQ ID NO:
2, or a variant or derivative thereof.

Suitably, the biologically active fragment is at least 6 amino acids in length. 

In a preferred embodiment, the domains broadly described above are arranged in
an N- to C-terminal direction as follows: AL-ACP-KS-KR-ACP-ACP-KS-PCP-C-A-PCP-C.

Suitably, the biologically active fragment comprises at least one domain selected 
from the group consisting of the AL domain, the KS domain, the KR domain, the ACP 
domain, the A domain, the PCP domain and the C domain as broadly described above. 

Suitably, the variant has at least 60%, preferably at least 70%, more preferably at
least 80%, more preferably at least 90% and still more preferably at least 95% sequence
identity to the sequence set forth in SEQ ID NO: 2.
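
To illustrate how a candidate variant might be compared against such tiered identity thresholds, the following Python sketch computes percent identity over a pre-computed pairwise alignment. It is a minimal sketch under stated assumptions: the function names, the gap character and the choice to divide matches by the number of alignment columns are all assumptions made here, and the definition of sequence identity used in the specification itself may differ.

```python
# Tiers quoted in the text, highest first.
THRESHOLDS = (95, 90, 80, 70, 60)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over two already-aligned, equal-length sequences.

    Assumption: identity = identical non-gap columns / all columns in
    which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity):
    """Return the highest quoted tier that the identity reaches, or None."""
    for tier in THRESHOLDS:
        if identity >= tier:
            return tier
    return None
```

For example, two aligned ten-residue sequences differing at a single position give 90% identity, which meets the 90% tier but not the 95% tier. The short sequences used for such checks are hypothetical and are not taken from the sequence listing.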

Preferably, the variant comprises at least one sequence selected from the group
consisting of SEQ ID NO: 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40,
42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78 and 80, or variant
thereof. In this regard, the variant preferably has at least 70%, preferably at least 80%,
more preferably at least 90%, and still more preferably at least 95% sequence identity to
any one of the amino acid sequences set forth in SEQ ID NO: 6, 8, 10, 12, 14, 16, 18, 20,
22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68,
70, 72, 74, 76, 78 and 80.

In another aspect, the present invention contemplates an isolated polypeptide
encoding at least a portion of a PPTase (XabA) associated with albicidin biosynthesis or its
variants or derivatives. In one embodiment of this type, the invention provides an isolated
polypeptide comprising at least a biologically active fragment or portion of the sequence set
forth in SEQ ID NO: 83, or a variant or derivative thereof.

Suitably, the biologically active fragment comprises at least one, and preferably
both, of the consensus PPTase sequence motifs set forth in SEQ ID NO: 89 and 93, or
variant thereof. Preferably, the biologically active fragment comprises the intervening
sequence between the said consensus PPTase sequence motifs, which intervening sequence
comprises the sequence set forth in SEQ ID NO: 91, or variant thereof.

Preferably, the biologically active fragment comprises a contiguous sequence of 
amino acids contained within the sequence set forth in SEQ ID NO: 87, or variant thereof. 

Suitably, the variant has at least 60%, preferably at least 70%, more preferably at
least 80%, more preferably at least 90% and still more preferably at least 95% sequence
identity to the sequence set forth in SEQ ID NO: 83.

Preferably, the variant comprises at least one sequence selected from the group
consisting of SEQ ID NO: 87, 89, 91 and 93, or variant thereof. In this regard, the variant
preferably has at least 70%, preferably at least 80%, more preferably at least 90%, and still
more preferably at least 95% sequence identity to any one of the amino acid sequences set
forth in SEQ ID NO: 87, 89, 91 or 93.

In yet another aspect, the present invention contemplates an isolated polypeptide
encoding at least a portion of a methyltransferase (XabC) associated with albicidin
biosynthesis or its variants or derivatives. In one embodiment of this type, the invention
provides an isolated polypeptide comprising at least a biologically active fragment or portion
of the sequence set forth in SEQ ID NO: 95, or a variant or derivative thereof.

Suitably, the biologically active fragment comprises at least one, and preferably
all, of the consensus methyltransferase sequence motifs set forth in SEQ ID NO: 99, 101
and 103, or variant thereof.

Preferably, the biologically active fragment comprises a contiguous sequence of
amino acids contained within the sequence set forth in SEQ ID NO: 105, or variant thereof.
In a preferred embodiment, the biologically active fragment comprises a contiguous
sequence of amino acids contained within the sequence set forth in SEQ ID NO: 107, or
variant thereof.

Suitably, the variant has at least 60%, preferably at least 70%, more preferably at
least 80%, more preferably at least 90% and still more preferably at least 95% sequence
identity to the sequence set forth in SEQ ID NO: 95.

Preferably, the variant has at least 70%, preferably at least 80%, more preferably 
at least 90%, and still more preferably at least 95% sequence identity to any one of the 
amino acid sequences set forth in SEQ ID NO: 99, 101 and 103. 

In still yet another aspect, the invention contemplates an isolated polynucleotide
encoding at least a portion of an albicidin PKS-NRPS (XabB) or its variants or derivatives,
as broadly described above. Preferably, the polynucleotide comprises the sequence set
forth in any one of SEQ ID NO: 1 and 3, or a biologically active fragment thereof, or a
polynucleotide variant of these.

Suitably, the biologically active fragment is at least 18 nucleotides in length. 

The polynucleotide preferably encodes at least one domain selected from the
group consisting of the AL domain, the KS domain, the KR domain, the ACP domain, the
A domain, the PCP domain and the C domain as broadly described above.

Suitably, the AL domain is encoded by a nucleotide sequence set forth in any one
or more of SEQ ID NO: 5 and 7, or variants thereof. Preferably, the AL domain is encoded
by a nucleotide sequence comprising each of the sequences set forth in SEQ ID NO: 5 and
7, or variants thereof.

The KS domain is preferably encoded by a nucleotide sequence set forth in any
one or more of SEQ ID NO: 9, 11, 13, 15, 17 and 19, or variants thereof. In one
embodiment, the KS domain is preferably encoded by a nucleotide sequence comprising
each of the sequences set forth in SEQ ID NO: 9, 11 and 13, or variants thereof. In an
alternate embodiment, the KS domain is preferably encoded by a nucleotide sequence
comprising each of the sequences set forth in SEQ ID NO: 15, 17 and 19, or variants
thereof.

Preferably, the KR domain is encoded by a nucleotide sequence set forth in SEQ
ID NO: 21, or variant thereof.

Suitably, the ACP domain is encoded by a nucleotide sequence set forth in any 
one or more of SEQ ID NO: 23, 25 and 27, or variants thereof. 

The A domain is preferably encoded by a nucleotide sequence set forth in any one
or more of SEQ ID NO: 29, 31, 33, 35, 37, 39, 41, 43, 45 and 47, or variants thereof. In a
preferred embodiment, the A domain is encoded by a nucleotide sequence comprising each
of the sequences set forth in SEQ ID NO: 29, 31, 33, 35, 37, 39, 41, 43, 45 and 47, or
variants thereof.

Suitably, the PCP domain is encoded by a nucleotide sequence set forth in any
one or more of SEQ ID NO: 49 and 51, or variants thereof.

Preferably, the C domain is encoded by a nucleotide sequence set forth in any one
or more of SEQ ID NO: 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 and 79, or variants
thereof. In one embodiment, the C domain is preferably encoded by a nucleotide sequence
comprising each of the sequences set forth in SEQ ID NO: 53, 55, 57, 59, 61, 63 and 65, or
variants thereof. In an alternate embodiment, the C domain is preferably encoded by a
nucleotide sequence comprising each of the sequences set forth in SEQ ID NO: 67, 69, 71,
73, 75, 77 and 79, or variants thereof.
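
The sequence identifiers recited for each XabB domain can be collected into a lookup table, which makes the numbering pattern easier to audit. The Python below is a bookkeeping sketch only: the identifiers are copied from the lists in this summary, but the dictionary names and the idea that each odd-numbered nucleotide sequence pairs with the next even-numbered amino acid sequence are assumptions inferred from the listing order, not statements made in the text.

```python
# SEQ ID NOs recited in this summary for nucleotide sequences, per domain.
NUCLEOTIDE_SEQ_IDS = {
    "AL": [5, 7],
    "KS": list(range(9, 20, 2)),    # 9, 11, ..., 19
    "KR": [21],
    "ACP": [23, 25, 27],
    "A": list(range(29, 48, 2)),    # 29, 31, ..., 47
    "PCP": [49, 51],
    "C": list(range(53, 80, 2)),    # 53, 55, ..., 79
}

# SEQ ID NOs recited for the corresponding amino acid sequences.
AMINO_ACID_SEQ_IDS = {
    "AL": [6, 8],
    "KS": list(range(10, 21, 2)),   # 10, 12, ..., 20
    "KR": [22],
    "ACP": [24, 26, 28],
    "A": list(range(30, 49, 2)),    # 30, 32, ..., 48
    "PCP": [50, 52],
    "C": list(range(54, 81, 2)),    # 54, 56, ..., 80
}


def pairs_by_offset(domain):
    """Check the assumed pairing: amino acid SEQ ID = nucleotide SEQ ID + 1."""
    nucleotide_ids = NUCLEOTIDE_SEQ_IDS[domain]
    amino_acid_ids = AMINO_ACID_SEQ_IDS[domain]
    return len(nucleotide_ids) == len(amino_acid_ids) and all(
        aa == nt + 1 for nt, aa in zip(nucleotide_ids, amino_acid_ids)
    )


all_consistent = all(pairs_by_offset(d) for d in NUCLEOTIDE_SEQ_IDS)
total_nucleotide_ids = sum(len(ids) for ids in NUCLEOTIDE_SEQ_IDS.values())
```

Under that assumed pairing, every domain's lists line up one-to-one, covering 38 nucleotide sequence identifiers in total.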

In one embodiment, the polynucleotide variant has at least 60%, preferably at
least 70%, more preferably at least 80%, and still more preferably at least 90% sequence
identity to any one of the polynucleotides set forth in SEQ ID NO: 1 or 3.

In another embodiment, the polynucleotide variant is capable of hybridising to
any one of the polynucleotides identified by SEQ ID NO: 1 or 3 under at least low
stringency conditions, preferably under at least medium stringency conditions, and more
preferably under high stringency conditions.

Preferably, the polynucleotide variant comprises a nucleotide sequence encoding 
at least one domain selected from the group consisting of the AL domain, the KS domain, 
the KR domain, the ACP domain, the A domain, the PCP domain and the C domain as 
broadly described above.

In one embodiment, the nucleotide sequence variant has at least 60%, preferably
at least 70%, more preferably at least 80%, and still more preferably at least 90% sequence
identity to any one of the sequences set forth in SEQ ID NO: 5, 7, 9, 11, 13, 15, 17, 19, 21,
23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69,
71, 73, 75, 77 and 79.

In another embodiment, the nucleotide sequence variant is capable of hybridising
to any one of the sequences identified by SEQ ID NO: 5, 7, 9, 11, 13, 15, 17, 19, 21, 23,
25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71,
73, 75, 77 and 79 under at least low stringency conditions, preferably under at least
medium stringency conditions, and more preferably under high stringency conditions.

In a further aspect, the invention contemplates an isolated polynucleotide
encoding at least a portion of a PPTase (XabA) associated with albicidin biosynthesis or its
variants or derivatives, an isolated polynucleotide encoding a polypeptide, fragment,
variant or derivative as broadly described above. Preferably, the polynucleotide comprises
the sequence set forth in any one of SEQ ID NO: 82 and 84, or a biologically active
fragment thereof, or a polynucleotide variant of these.

Alternatively, the polynucleotide comprises a contiguous sequence of nucleotides 
contained within the sequence set forth in SEQ ID NO: 86, or variant thereof. 

In one embodiment, the polynucleotide variant has at least 60%, preferably at
least 70%, more preferably at least 80%, and still more preferably at least 90% sequence
identity to any one of the polynucleotides set forth in SEQ ID NO: 82, 84 and 86.

In another embodiment, the polynucleotide variant is capable of hybridising to
any one of the polynucleotides identified by SEQ ID NO: 82, 84 and 86 under at least low
stringency conditions, preferably under at least medium stringency conditions, and more
preferably under high stringency conditions.

Preferably, the polynucleotide variant comprises a nucleotide sequence encoding
at least one PPTase sequence motif selected from SEQ ID NO: 89 and 93, or variant
thereof. Suitably, the polynucleotide variant comprises a nucleotide sequence encoding the
intervening sequence between the said consensus PPTase sequence motifs, said nucleotide
sequence comprising the sequence set forth in SEQ ID NO: 91.

The polynucleotide variant suitably comprises a nucleotide sequence encoding a
contiguous sequence of amino acids contained within the sequence set forth in SEQ ID
NO: 87, or variant thereof. In this instance, the contiguous sequence is preferably encoded
by the sequence set forth in SEQ ID NO: 86, or nucleotide sequence variant thereof.

Suitably, the PPTase sequence motif is encoded by a nucleotide sequence 
comprising the sequence set forth in any one of SEQ ID NO: 88 and 92, or nucleotide 
sequence variant thereof. 

Preferably, the said intervening sequence is encoded by the nucleotide sequence
set forth in SEQ ID NO: 90, or nucleotide sequence variant thereof.

In one embodiment, the nucleotide sequence variant has at least 60%, preferably 
at least 70%, more preferably at least 80%, and still more preferably at least 90% sequence 
identity to any one of the sequences set forth in SEQ ID NO: 86, 88, 90 and 92. 

In another embodiment, the nucleotide sequence variant is capable of hybridising
to any one of the sequences identified by SEQ ID NO: 86, 88, 90 and 92 under at least low
stringency conditions, preferably under at least medium stringency conditions, and more
preferably under high stringency conditions.

In yet a further aspect, the invention contemplates an isolated polynucleotide
encoding at least a portion of a methyltransferase (XabC) associated with albicidin
biosynthesis or its variants or derivatives. Preferably, the polynucleotide comprises the
sequence set forth in any one of SEQ ID NO: 94 and 96, or a biologically active fragment
thereof, or a polynucleotide variant of these.

Alternatively, the polynucleotide comprises a contiguous sequence of nucleotides
contained within the sequence set forth in SEQ ID NO: 104, or variant thereof. In one
embodiment, this polynucleotide preferably comprises a contiguous sequence of
nucleotides contained within the sequence set forth in SEQ ID NO: 106, or variant thereof.

In one embodiment, the polynucleotide variant has at least 60%, preferably at
least 70%, more preferably at least 80%, and still more preferably at least 90% sequence
identity to any one of the polynucleotides set forth in SEQ ID NO: 94, 96, 104 and 106.

In another embodiment, the polynucleotide variant is capable of hybridising to
any one of the polynucleotides identified by SEQ ID NO: 94, 96, 104 and 106 under at
least low stringency conditions, preferably under at least medium stringency conditions,
and more preferably under high stringency conditions.

Preferably, the polynucleotide variant comprises a nucleotide sequence encoding a
methyltransferase sequence motif selected from any one or more of SEQ ID NO: 99, 101
and 103, or variant thereof.

Suitably, the methyltransferase sequence motif is encoded by a nucleotide 
sequence comprising the sequence set forth in any one of SEQ ID NO: 98, 100 and 102, or 
nucleotide sequence variant thereof. 

In one embodiment, the nucleotide sequence variant has at least 60%, preferably
at least 70%, more preferably at least 80%, and still more preferably at least 90% sequence
identity to any one of the sequences set forth in SEQ ID NO: 98, 100 and 102.

In another embodiment, the nucleotide sequence variant is capable of hybridising 
to any one of the sequences identified by SEQ ID NO: 98, 100 and 102 under at least low 
stringency conditions, preferably under at least medium stringency conditions, and more 
preferably under high stringency conditions. 

In still a further aspect, the invention features an expression vector comprising a 
polynucleotide as broadly described above wherein the polynucleotide is operably linked 
to a regulatory polynucleotide. 

In another aspect, the invention provides a host cell containing a said expression 
vector. 

Suitably, the host cell is a bacterium or other prokaryote. 

In yet another aspect, the invention is directed to a multiplicity of cell colonies, 
constituting a library of colonies, wherein each colony of the library contains an expression 
vector for the production of a polypeptide, fragment, variant or derivative as broadly 
described above. 

The invention also features a method of producing a recombinant polypeptide, 
fragment, variant or derivative as broadly described above, comprising: 

- culturing a host cell containing an expression vector as broadly described 
above such that said recombinant polypeptide, fragment, variant or derivative is 
expressed from said polynucleotide; and 

- isolating the said recombinant polypeptide, fragment, variant or derivative. 

In another aspect, the invention provides a method of producing a biologically 
active fragment of a polypeptide as broadly described above, comprising: 

- detecting an activity associated with a fragment of the polypeptide set forth in 
SEQ ID NO: 2, wherein said activity is selected from the group consisting of acyl-CoA 
ligase activity, β-ketoacyl synthase activity, β-ketoacyl reductase, acyl carrier protein 
activity, adenylation activity, peptidyl carrier protein activity and condensation activity; 
or 

- detecting PPTase activity associated with a fragment of the polypeptide set 
forth in SEQ ID NO: 83; or 

- detecting methyltransferase activity associated with a fragment of the 
polypeptide set forth in SEQ ID NO: 95; 

wherein detection of said activity is indicative of said fragment being a biologically 
active fragment. 

In a further aspect, the invention provides a method of producing a biologically 
active fragment as broadly described above, comprising: 

- introducing a polynucleotide from which a fragment of a polypeptide as 
broadly described above can be produced into a cell; and 

- detecting an activity selected from the group consisting of acyl-CoA ligase 
activity, β-ketoacyl synthase activity, β-ketoacyl reductase, acyl carrier protein activity, 
adenylation activity, peptidyl carrier protein activity and condensation activity; or 

- detecting PPTase activity associated with a fragment of the polypeptide set 
forth in SEQ ID NO: 83; or 

- detecting methyltransferase activity associated with a fragment of the 
polypeptide set forth in SEQ ID NO: 95; 

wherein detection of said activity is indicative of said fragment being a biologically 
active fragment. 

In yet a further aspect, the invention provides a method of producing a variant of a 
polypeptide as broadly described above (parent polypeptide), or a biologically active 
fragment thereof, comprising: 

- producing a modified polypeptide whose sequence is distinguished from the 
parent polypeptide or the biologically active fragment by substitution, deletion or 
addition of at least one amino acid; and 

- detecting an activity associated with the modified polypeptide, wherein said 
activity is selected from the group consisting of acyl-CoA ligase activity, β-ketoacyl 
synthase activity, β-ketoacyl reductase, acyl carrier protein activity, adenylation 
activity, peptidyl carrier protein activity, condensation activity, PPTase activity and 
methyltransferase activity, wherein detection of said activity is indicative of said 
modified polypeptide being a variant. 
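
Purely as a sketch, the three kinds of modification named in this method (substitution, deletion and addition of at least one amino acid) can be illustrated on a polypeptide written as a string of one-letter residue codes. The helper names and the short example peptide are hypothetical, and the subsequent activity-detection step is not modelled here.

```python
# Illustrative sketch only: sequence-level modifications of a parent polypeptide.
# Positions are 0-based; the helper names are hypothetical.

def substitute(seq: str, pos: int, residue: str) -> str:
    """Replace the residue at position pos with one new residue."""
    if len(residue) != 1:
        raise ValueError("a substitution takes exactly one residue")
    if not 0 <= pos < len(seq):
        raise IndexError("position outside the sequence")
    return seq[:pos] + residue + seq[pos + 1:]

def delete(seq: str, pos: int, length: int = 1) -> str:
    """Remove length residues starting at position pos."""
    if length < 1 or pos < 0 or pos + length > len(seq):
        raise IndexError("deleted span outside the sequence")
    return seq[:pos] + seq[pos + length:]

def add(seq: str, pos: int, residues: str) -> str:
    """Insert one or more residues before position pos (pos == len(seq) appends)."""
    if not residues:
        raise ValueError("at least one residue must be added")
    if not 0 <= pos <= len(seq):
        raise IndexError("position outside the sequence")
    return seq[:pos] + residues + seq[pos:]
```

Each helper returns a new sequence that differs from the parent by at least one amino acid, which is the sequence-level condition stated in the first step of the method.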

In a further aspect, the invention contemplates a method of producing a variant of 
a parent polypeptide as broadly described above, or biologically active fragment thereof, 
comprising: 

- producing a polynucleotide from which a modified polypeptide as described 
above can be produced; 

- introducing said polynucleotide into a cell; and 

- detecting an activity associated with the modified polypeptide, wherein said 
activity is selected from the group consisting of acyl-CoA ligase activity, β-ketoacyl 
synthase activity, β-ketoacyl reductase, acyl carrier protein activity, adenylation 
activity, peptidyl carrier protein activity, condensation activity, PPTase activity and 
methyltransferase activity, wherein detection of said activity is indicative of said 
modified polypeptide being a variant. 

In yet another aspect, the invention extends to a method of screening for an agent 
that modulates the expression of a gene or variant thereof or the level and/or functional 
activity of an expression product of said gene or variant thereof, wherein said gene is 
selected from xabB, xabA, or xabC, or a gene belonging to the same regulatory or 
biosynthetic pathway as xabB, xabA, or xabC, said method comprising: 

- contacting a preparation comprising a polypeptide encoded by said gene, or 
biologically active fragment of said polypeptide, or variant or derivative of these, or a 
genetic sequence (e.g., a transcriptional control element) that modulates the expression 
of said gene or variant thereof, with a test agent; and 

- detecting a change in the level and/or functional activity of said polypeptide or 
biologically active fragment thereof, or variant or derivative, or of a product expressed 
from said genetic sequence. 

The transcriptional control element preferably comprises the sequence set forth in 
SEQ ID NO: 81 or complement thereof. 

The invention, in another aspect, also provides a method for enhancing the level 
and/or functional activity of an albicidin, said method comprising: 

- introducing into an albicidin-producing host cell (1) an agent that modulates 
the expression of a gene encoding at least a portion of an albicidin PKS-NRPS or 
variant or derivative thereof, or the level and/or functional activity of an expression 
product of said gene, or (2) a vector from which a polynucleotide encoding at least a 
portion of an albicidin PKS-NRPS or variant or derivative thereof can be translated; 

- and culturing the host cell for a time and under conditions sufficient to 
enhance the level and/or functional activity of said albicidin. 

Preferably, the method further comprises introducing into said host cell a vector 
from which a PPTase can be translated. Suitably, the PPTase is selected from EntD or 
XabA. 

Preferably, the method further comprises introducing into said host cell a vector 
from which a methyltransferase, more preferably an O-methyltransferase, and even more 
preferably an S-adenosylmethionine O-methyltransferase can be translated. 

According to another aspect of the invention, there is provided a method for 
enhancing the level and/or functional activity of an albicidin, said method comprising 
contacting a precursor of said albicidin or an intermediate involved in the biosynthesis of 
said albicidin with at least a portion of an albicidin PKS-NRPS, or variant or derivative 
thereof, as broadly described above, for a time and under conditions sufficient to enhance 
the level and/or functional activity of said albicidin. 

Preferably, the method further comprises contacting a precursor of said albicidin 
or an intermediate involved in the biosynthesis of said albicidin with a PPTase. 

Preferably, the method further comprises contacting a precursor of said albicidin 
or an intermediate involved in the biosynthesis of said albicidin with a methyltransferase, 
more preferably an O-methyltransferase, and even more preferably an 
S-adenosylmethionine O-methyltransferase. 

In another aspect, the invention provides a method of identifying a PPTase for 
enhancing the level and/or functional activity of an albicidin, said method comprising: 
introducing into an albicidin-deficient strain of X. albilineans which lacks xabA a vector 
comprising a polynucleotide encoding a test PPTase, wherein said polynucleotide is 
operably linked to a regulatory polynucleotide, and detecting production of albicidin. 

Suitably, the strain is LS156 described herein. 

Preferably, the PPTase is EntD. 

The invention, in another aspect, also provides a method for enhancing the level 
and/or functional activity of an albicidin, said method comprising: 

- introducing into an albicidin-producing host cell (1) an agent that modulates 
the expression of a gene encoding at least a portion of a PPTase associated with 
albicidin biosynthesis or variant or derivative thereof, or the level and/or functional 
activity of an expression product of said gene, or (2) a vector from which a 
polynucleotide encoding at least a portion of a PPTase associated with albicidin 
biosynthesis or variant or derivative thereof can be translated; 

- and culturing the host cell for a time and under conditions sufficient to 
enhance the level and/or functional activity of said albicidin. 

In yet another aspect, the invention provides a method for enhancing the level 
and/or functional activity of an albicidin, said method comprising: 

- introducing into an albicidin-producing host cell (1) an agent that modulates 
the expression of a gene encoding at least a portion of a methyltransferase associated 
with albicidin biosynthesis or variant or derivative thereof, or the level and/or 
functional activity of an expression product of said gene, or (2) a vector from which 
a polynucleotide encoding at least a portion of a methyltransferase associated with 
albicidin biosynthesis or variant or derivative thereof can be translated; 

- and culturing the host cell for a time and under conditions sufficient to 
enhance the level and/or functional activity of said albicidin. 

In another aspect, the invention resides in an antigen-binding molecule that is 
immuno-interactive with a polypeptide, fragment, variant or derivative as broadly 
described above. 

In yet another aspect, the invention provides a method to prepare a polynucleotide 
encoding a modified PKS, comprising using an albicidin PKS-NRPS encoding nucleotide 
sequence as a scaffold and modifying the portions of the nucleotide sequence that encode 
enzymatic activities, either by mutagenesis, inactivation, deletion, insertion, or 
replacement. 

In still yet another aspect, the invention contemplates a method for producing 
polyketides, comprising expressing the modified albicidin PKS encoding nucleotide 
sequence as broadly described in a suitable host cell to thereby produce a polyketide 
different from that produced by the albicidin PKS-NRPS. 

Another aspect of the invention contemplates the insertion of portions of the 
albicidin PKS-NRPS coding sequence into other PKS coding sequences to modify the 
products thereof. 

In a further aspect, the invention encompasses use of the polypeptide, fragment, 
variant or derivative as broadly described above, or the polynucleotide or vector as broadly 
described above, or the modulatory agent as broadly described above for producing 
secondary metabolites, preferably albicidins. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a schematic representation showing a physical and functional map of 
part of the albicidin biosynthetic gene cluster. (A). Partial physical map of the Tn5 
insertion locus in LS157 genomic DNA. Restriction enzymes used: C, ClaI; E, EcoRI; S, 
SpeI; N, NotI; and B, BamHI. (B). Probes used to recover clone pXABB: Probe 1, 1.4-kb 
EcoRI-NotI fragment digested from pBC157; and probe 2, 0.9-kb PCR product amplified 
from Xa13 genomic DNA using primers complementary to sequences flanking the Tn5 
insertion in LS157. (C). Clones and subclones used for sequencing, and described in Table 
1. (D). The transcription directions of three putative ORFs in the 16.5-kb EcoRI fragment are 
indicated by arrows. (E). Organisation of X. albilineans XabB constructed by comparison 
with known protein sequences. The unshaded box indicates the PKS region, and the shaded box 
indicates the NRPS region. Relative positions of potential catalytic domains or active sites are 
indicated by: AL, acyl-CoA ligase; ACP, acyl carrier protein; KS, β-ketoacyl synthase; 
KR, β-ketoacyl reductase; PCP, peptidyl carrier protein; C, condensation; A, adenylation. 
Horizontal bars indicate proposed biosynthetic modules. 

Figure 2 is a diagrammatic representation presenting the sequence of the region 
upstream from xabB. The nucleotide sequence is numbered according to the 16511-bp 
sequence in GenBank accession no. AF239749. The putative -35 and -10 promoter 
sequences of xabB and the divergent gene xatA are underlined, as are ribosome-binding 
sequences. The transcriptional directions of xabB and xatA are indicated by arrows. 
Translational start codons are indicated by boldface type. Primers P1F1 and P1R are 
shaded. 

Figure 3 is a diagrammatic representation showing the alignment of X. albilineans 
XabB enzymatic domains with those of PKSs and FASs from other organisms. Identical 
amino acids are indicated by boldface type. Stars and overlines identify conserved amino 
acids at catalytic sites. Xal-XabB, X. albilineans XabB for biosynthesis of albicidin (this 
study); Hin-LCFA, Haemophilus influenzae long-chain fatty acid-CoA ligase (P46450); 
Bsu-PksJ, B. subtilis polyketide synthase J (P40806); Bsu-MycA, B. subtilis MycA for 
biosynthesis of mycosubtilin (AF184956); Pcr-ComL2, Petroselinum crispum 
4-coumarate-CoA ligase 2 (P14913); Sma-FkbB, S. sp. MA6548 FkbB for biosynthesis of 
FK506 (AF082099); Ame-RifA, Amycolatopsis mediterranei RifA for biosynthesis of 
rifamycin B (AF040570); Shy-RapA, S. hygroscopicus RapA for biosynthesis of 
rapamycin (X86780); Mxa-Ta1, M. xanthus Ta1 for biosynthesis of TA (AJ006977); 
Ser-EryA1 and EryA3, S. erythraea EryA modules for biosynthesis of erythromycin (M63676, 
M63677); Che-PKS1, Cochliobolus heterostrophus PKS1 for biosynthesis of T-toxin 
(U68040); Bsu-PksM, B. subtilis PKS for a polyketide synthase (O31781); Mtu-PpsA, M. 
tuberculosis PKS for a polyketide synthase (G3261605); Mtu-MAS, M. tuberculosis MAS 
for biosynthesis of mycocerosic acid (M95808); Chick-FAS, chicken fatty acid synthase 
(M22987); Rat-FAS, rat fatty acid synthase (X14175). 

Figure 4 is a graphical representation showing albicidin production by wild-type 
X. albilineans LS155 (A), complemented Tox⁻ mutant strain LS157 pLXABB1 (O), 
complemented Tox⁻ mutant strain LS157 pLXABB2 (•), LS157 (■), and LS157 
pLAFR3 (+). Albicidin concentrations in culture supernatants were quantified based on 
inhibition zone width in a microbial bioassay (means +/- standard errors from 5 replicates). 
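
As a minimal sketch of how a "mean +/- standard error" summary of replicate measurements may be computed, the following assumes the conventional definition of the standard error (sample standard deviation divided by the square root of the number of replicates). The function name is hypothetical, and the conversion of inhibition zone width to albicidin concentration is not modelled.

```python
# Illustrative sketch only: mean and standard error of the mean for replicates.
# Assumption: standard error = sample standard deviation / sqrt(n).
import math
import statistics

def mean_and_standard_error(replicates):
    """Return (mean, standard error) for a list of replicate measurements."""
    n = len(replicates)
    if n < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    mean = statistics.mean(replicates)
    standard_error = statistics.stdev(replicates) / math.sqrt(n)
    return mean, standard_error
```

The same summary applies whether a figure reports 2, 3 or 5 replicates; only the divisor changes.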

Figure 5 is a graphical representation showing the relationship between growth 
(■), albicidin production (O), and GUS activity (A) in X. albilineans LS155 pRG960p1 
(A) and in LS155 pRG960p2 (B). Relative activity (means +/- standard errors from 2 
replicates). 100% growth, OD550 = 1.43; 100% albicidin production = 268.5 units/ml; 
100% GUS activity = 119 units/mg of protein (one unit equals 1 pmol of 
methylumbelliferone formed per min.). Locations and sizes of inserts on pRG960p1 and 
pRG960p2 are indicated in Figure 2 and Table 1. GUS, β-glucuronidase. 
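
The relative activities plotted in Figure 5 are percentages of the stated 100% reference values. A minimal sketch of that normalisation follows; the reference values are those given in the legend, while the dictionary keys and function name are hypothetical.

```python
# Illustrative sketch only: expressing a measurement as a percentage of its
# 100% reference value from the Figure 5 legend. Keys are hypothetical names.
REFERENCE_100_PERCENT = {
    "growth_od550": 1.43,             # optical density at 550 nm
    "albicidin_units_per_ml": 268.5,  # albicidin production, units/ml
    "gus_units_per_mg": 119.0,        # GUS activity, units/mg of protein
}

def relative_activity(kind: str, value: float) -> float:
    """Return value as a percentage of the reference for the given measurement."""
    return 100.0 * value / REFERENCE_100_PERCENT[kind]
```

For instance, a GUS activity of 59.5 units/mg of protein corresponds to 50% relative activity on this scale.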

Figure 6 is a schematic representation showing the organisation of five known 
PKS-NRPS enzymes. X. albilineans XabB, encoded by xabB for albicidin biosynthesis 
(this study); B. subtilis MycA for mycosubtilin biosynthesis (Duitman et al., 1999); 
Yersinia pestis HMWP1 for yersiniabactin biosynthesis (Gehring et al., 1998); M. xanthus 
partial gene product Ta1 for TA biosynthesis (Paitan et al., 1999); B. subtilis PksorfX6 for 
unknown function (Albertini et al., 1995). Unshaded boxes indicate PKS regions, grey 
boxes indicate NRPS regions, and dark boxes indicate amino transferase (AMT) or 
methyltransferase (MT). Vertical bars follow the carrier domains at the end of each 
biosynthetic "module". 

Figure 7 is a diagrammatic representation showing a dendrogram (GCG) analysis 
of adenylation domains of XabB and its homologous peptide synthetases. Peptide 
synthetases, including various modules of the same multienzyme complex, are as follows: 
GrsA and GrsB, gramicidin synthetase A and B, respectively, from B. subtilis (X15577, 
X61658); BacA, BacB, and BacC, bacitracin synthetase A, B, and C, respectively, from B. 
licheniformis (AF007865); SnbC and SnbDE, pristinamycin I synthetase C and DE, 
respectively, from S. pristinaespiralis (X98690, Y11547); FkbP, FK506 synthetase FkbP 
from S. sp. MA6548 (AF082100); TycA, TycB, and TycC, tyrocidine synthetase A, B, and 
C, respectively, from B. brevis (AF004835); SyrE, syringomycin synthetase E1 from 
Pseudomonas syringae pv. syringae (AF047828); EntF, enterobactin synthetase F from E. 
coli (P11454); DhbF, 2,3-dihydroxybenzoate synthetase F from B. subtilis (P45745); 
FenD, fengycin synthetase FenD1 from B. subtilis (AJ011849); SrfAA, SrfAB, and SrfAC, 
surfactin A synthetase A, B, and C, respectively, from B. subtilis (X70356); XabB, 
albicidin synthase B from X. albilineans (this study). The A4 to A5 regions (about 100 aa) 
of adenylation domains of peptide synthetases, which are involved in amino acid recognition 
and binding, were aligned using the PILEUP program with default parameters. 

Figure 8 is a diagrammatic representation showing a restriction map of clones 
including the xabA gene from X. albilineans. Sequencing by primer walking commenced at 
the T3 and T7 primers. The location and direction of transcription of the xabA ORF is 
shown by an arrow. Restriction enzymes are: E, EcoRI; P, PstI; C, ClaI; and H, HindIII. 

Figure 9 is a diagrammatic representation presenting the sequence of the xabA 
gene. The nucleotide sequence is numbered according to the 3-kb sequence in GenBank 
accession no. AF191324. The closest matches to RBS region and promoter consensus 
sequences are underlined, as are the region of dyad symmetry and putative 
factor-independent termination sites. Translation start and stop codons are indicated by boldface 
type. The (V/I)G(V/I)D and (F/W)(S/C/T)xKE(A/S)xxK motifs conserved in PPTase 
enzymes are boxed. The insertion site of Tn5 is marked (▼). 
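
By way of illustration, the two conserved PPTase motifs named in the Figure 9 legend can be written as regular expressions and located in a protein sequence. The sketch reads the second motif as (F/W)(S/C/T)xKE(A/S)xxK with x standing for any residue; that reading, the function name and the test peptide are assumptions of the sketch, and the peptide is not taken from SEQ ID NO: 83.

```python
# Illustrative sketch only: locating PPTase-type motifs in a protein sequence.
# Assumption: "x" in the motif notation matches any single residue.
import re

PPTASE_MOTIFS = {
    "motif_1": re.compile(r"[VI]G[VI]D"),           # (V/I)G(V/I)D
    "motif_2": re.compile(r"[FW][SCT].KE[AS]..K"),  # (F/W)(S/C/T)xKE(A/S)xxK
}

def find_pptase_motifs(protein: str):
    """Return (motif name, 0-based start, matched residues) for every hit."""
    seq = protein.upper()
    hits = []
    for name, pattern in PPTASE_MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start(), match.group()))
    return hits
```

A sequence lacking either motif simply yields no hit for it, so the helper can also be used to flag candidate fragments that retain both motifs.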

Figure 10 is a graphical representation showing albicidin production by wild-type 
X. albilineans strain Xa13 (O), Xa13 pLXABA (•), and complemented Tox⁻ mutant strain 
LS156 pLXABA (A). Albicidin concentrations in culture supernatants were quantified 
based on inhibition zone width in a microbial bioassay (means +/- standard errors from 2 
replicates). 

Figure 11 is a schematic representation showing a dendrogram (GCG) analysis of 
PPTases involved in antibiotic and fatty acid biosynthesis in bacteria. Sau, Salmonella 
austin; Sty, Salmonella typhimurium; Bbr, Bacillus brevis; Xal, Xanthomonas albilineans; 
Eco, Escherichia coli; Sfl, Shigella flexneri; Bpu, Bacillus pumilus; Bsu, Bacillus subtilis; 
Mtu, Mycobacterium tuberculosis; Hin, Haemophilus influenzae. The sources of amino 
acid sequence of PPTases correspond to those in Table 2, and the sequences were aligned 
using the PILEUP program with default parameters. 

Figure 12 is a schematic representation showing the organisation of part of the 
albicidin biosynthetic gene cluster. The location and direction of three ORFs are indicated 
by thick arrows. Vertical lines indicate the position of restriction enzyme sites: E, EcoRI; 
B, BamHI; S, SpeI; N, NcoI. The vertical lines with triangles ( A ) show the position of 
insertional mutagenesis sites or Tn5 insertion site, and the resultant mutants are bracketed. 
The arrows above the physical map indicate the locations of primers used to amplify 
sequence downstream of the EcoRI restriction site by IPCR. The cloned regions for 
complementation tests are shown below the map. 

Figure 13 is a diagrammatic representation presenting the nucleotide and deduced 
amino acid sequences of the xabC region. The nucleotide sequence is numbered according 
to the 1515-bp sequence in GenBank accession no. AF239750. The potential RBS and 
selected restriction sites are underlined. The putative factor-independent termination 
signals are underlined and indicated by bold letters. Translation start and stop codons are 
indicated by bold letters. The conserved motifs in Mtases are boxed. Primers used for PCR 
(A3F and A3R) and IPCR (IR) are shaded. 

Figure 14 is a diagrammatic representation showing the conserved sequence 
motifs in Mtases involved in antibiotic biosynthesis in bacteria. Identical or similar amino 
acids (A = G; D = E; I = L = V) are shown in bold. Numbers indicate amino acid residues 
from the N terminus of the protein. Xal-XabC, putative albicidin biosynthesis Mtase from 
X. albilineans (this study); Sgl-TcmO and Sgl-TcmN, multifunctional cyclase-dehydrase- 
3-O-Mtase and tetracenomycin polyketide synthesis 8-O-Mtase of Streptomyces 
glaucescens, respectively (accession number M80674); Smy-MdmC, midecamycin-O-Mtase 
of S. mycarofaciens (M93958); Mxa-SafC, saframycin O-Mtase of Myxococcus 
xanthus (U24657); Ser-EryG, erythromycin biosynthesis O-Mtase of Saccharopolyspora 
erythraea (S18533); Spe-DauK, carminomycin 4-O-Mtase from S. peucetius (L13453); 
Sal-DmpM, O-demethylpuromycin-O-Mtase from S. alboniger (M74560); Shy-RapM, 
rapamycin O-Mtase of S. hygroscopicus (X86780); Sav-AveD, avermectin B 5-O-Mtase 
from S. avermitilis (G5921167). 
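
The similarity rule stated in the Figure 14 legend (A = G; D = E; I = L = V) can be sketched as a test on the columns of an alignment: a column counts as "identical or similar" when every residue in it falls in the same group, with any residue outside the listed groups matching only itself. The function names and the toy alignment are hypothetical.

```python
# Illustrative sketch only: flagging alignment columns whose residues are
# identical or similar under the groups A = G; D = E; I = L = V.
SIMILARITY_GROUPS = [frozenset("AG"), frozenset("DE"), frozenset("ILV")]

def similarity_class(residue: str) -> frozenset:
    """Return the similarity group of a residue, or a singleton if ungrouped."""
    for group in SIMILARITY_GROUPS:
        if residue in group:
            return group
    return frozenset(residue)

def conserved_columns(aligned_sequences):
    """Return 0-based indices of gap-free columns with a single similarity class."""
    rows = [row.upper() for row in aligned_sequences]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned sequences must have equal length")
    conserved = []
    for index, column in enumerate(zip(*rows)):
        if "-" in column:
            continue
        if len({similarity_class(residue) for residue in column}) == 1:
            conserved.append(index)
    return conserved
```

In the toy alignment VDGAG / LEGSG / IDAGG, columns 0, 1, 2 and 4 would be shown in bold under this rule, while column 3 (A, S, G) would not.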

Figure 15 is a graphical representation showing albicidin production by wild-type 
X. albilineans LS155 (•), Tox⁻ xabC insertion mutant LS-JP2 (■), complemented strain 
LS-JP2 pLXABC containing Lac promoter - full length xabC gene (O), and complemented 
strain LS-JP2 pLXABB1 containing full length xabB plus functional N-terminal region of 
xabC (Jl). Albicidin concentrations in culture supernatants were quantified based on 
inhibition zone width in a microbial bioassay (means +/- standard errors from 2 or 3, 

BRIEF DESCRIPTION OF THE SEQUENCES: SUMMARY TABLE 



TABLE A 

SEQUENCE ID NUMBER | SEQUENCE | LENGTH 
SEQ ID NO: 1 | Full-length (Accession No. AF239749) | 16551 bases 
SEQ ID NO: 2 | Full-length polypeptide sequence encoded by SEQ ID NO: 1 | 4801 residues 
SEQ ID NO: 3 | Full-length coding sequence of xabB | 14406 bases 
SEQ ID NO: 4 | Polypeptide sequence encoded by SEQ ID NO: 3 | 4801 residues 
SEQ ID NO: 5 | Sub-sequence of SEQ ID NO: 1 and 3 encoding acyl-CoA ligase subdomain I | 45 bases 
SEQ ID NO: 6 | Acyl-CoA ligase subdomain I encoded by SEQ ID NO: 5 | 15 residues 
SEQ ID NO: 7 | Sub-sequence of SEQ ID NO: 1 and 3 encoding acyl-CoA ligase subdomain II | 24 bases 
SEQ ID NO: 8 | Acyl-CoA ligase subdomain II encoded by SEQ ID NO: 7 | 8 residues 
SEQ ID NO: 9 | Sub-sequence of SEQ ID NO: 1 and 3 encoding β-ketoacyl synthase 1 subdomain I | 51 bases 
SEQ ID NO: 10 | β-Ketoacyl synthase 1 subdomain I encoded by SEQ ID NO: 9 | 17 residues 
SEQ ID NO: 11 | Sub-sequence of SEQ ID NO: 1 and 3 encoding β-ketoacyl synthase 1 subdomain II | 30 bases 
SEQ ID NO: 12 | β-Ketoacyl synthase 1 subdomain II encoded by SEQ ID NO: 11 | 10 residues 
SEQ ID NO: 13 | Sub-sequence of SEQ ID NO: 1 and 3 encoding β-ketoacyl synthase 1 subdomain III | 30 bases 
SEQ ID NO: 14 | β-Ketoacyl synthase 1 subdomain III encoded by SEQ ID NO: 13 | 10 residues 
SEQ ID NO: 15 | Sub-sequence of SEQ ID NO: 1 and 3 encoding β-ketoacyl synthase 2 subdomain I | 51 bases 
SEQ ID NO: 16 | β-Ketoacyl synthase 2 subdomain I encoded by SEQ ID NO: 15 | 17 residues 
SEQ ID NO: 17 | Sub-sequence of SEQ ID NO: 1 and 3 encoding β-ketoacyl synthase 2 subdomain II | 30 bases 
SEQ ID NO: 18 | β-Ketoacyl synthase 2 subdomain II encoded by SEQ ID NO: 17 | 10 residues 
SEQ ID NO: 19 | Sub-sequence of SEQ ID NO: 1 and 3 encoding β-ketoacyl synthase 2 subdomain III | 30 bases 
SEQ ID NO: 20 | β-Ketoacyl synthase 2 subdomain III encoded by SEQ ID NO: 19 | 10 residues 
SEQ ID NO: 21 | Sub-sequence of SEQ ID NO: 1 and 3 encoding β-ketoacyl reductase domain | 93 bases 
SEQ ID NO: 22 | β-Ketoacyl reductase domain encoded by SEQ ID NO: 21 | 31 residues 
SEQ ID NO: 23 | Sub-sequence of SEQ ID NO: 1 and 3 encoding acyl carrier protein 1 domain | 36 bases 
SEQ ID NO: 24 | Acyl carrier protein 1 domain encoded by SEQ ID NO: 23 | 
SEQ ID NO: 25 | Sub-sequence of SEQ ID NO: 1 and 3 encoding acyl carrier protein 2 domain | 36 bases 
SEQ ID NO: 26 | Acyl carrier protein 2 domain encoded by SEQ ID NO: 25 | 12 residues 
SEQ ID NO: 27 | Sub-sequence of SEQ ID NO: 1 and 3 encoding acyl carrier protein 3 domain | 36 bases 
SEQ ID NO: 28 | Acyl carrier protein 3 domain encoded by SEQ ID NO: 27 | 12 residues 
SEQ ID NO: 29 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain I | 18 bases 
SEQ ID NO: 30 | Adenylation domain subdomain I encoded by SEQ ID NO: 29 | 6 residues 
SEQ ID NO: 31 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain II | 33 bases 
SEQ ID NO: 32 | Adenylation domain subdomain II encoded by SEQ ID NO: 31 | 11 residues 
SEQ ID NO: 33 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain III | 48 bases 
SEQ ID NO: 34 | Adenylation domain subdomain III encoded by SEQ ID NO: 33 | 16 residues 
SEQ ID NO: 35 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain IV | 12 bases 
SEQ ID NO: 36 | Adenylation domain subdomain IV encoded by SEQ ID NO: 35 | 4 residues 
SEQ ID NO: 37 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain V | 21 bases 
SEQ ID NO: 38 | Adenylation domain subdomain V encoded by SEQ ID NO: 37 | 7 residues 
SEQ ID NO: 39 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain VI | 45 bases 
SEQ ID NO: 40 | Adenylation domain subdomain VI encoded by SEQ ID NO: 39 | 15 residues 
SEQ ID NO: 41 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain VII | 18 bases 
SEQ ID NO: 42 | Adenylation domain subdomain VII encoded by SEQ ID NO: 41 | 6 residues 
SEQ ID NO: 43 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain VIII | 60 bases 
SEQ ID NO: 44 | Adenylation domain subdomain VIII encoded by SEQ ID NO: 43 | 20 residues 
SEQ ID NO: 45 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain IX | 21 bases 
SEQ ID NO: 46 | Adenylation domain subdomain IX encoded by SEQ ID NO: 45 | 7 residues 
SEQ ID NO: 47 | Sub-sequence of SEQ ID NO: 1 and 3 encoding adenylation domain subdomain X | 18 bases 
SEQ ID NO: 48 | Adenylation domain subdomain X encoded by SEQ ID NO: 47 | 6 residues 
SEQ ID NO: 49 | Sub-sequence of SEQ ID NO: 1 and 3 encoding peptidyl carrier protein 1 domain | 33 bases 
SEQ ID NO: 50 | Peptidyl carrier protein 1 domain encoded by SEQ ID NO: 49 | 11 residues 
SEQ ID NO: 51 | Sub-sequence of SEQ ID NO: 1 and 3 encoding peptidyl carrier protein 2 domain | 33 bases 
SEQ ID NO: 52 | Peptidyl carrier protein 2 domain encoded by SEQ ID NO: 51 | 11 residues 
SEQ ID NO: 53 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 1 subdomain I | 30 bases 
SEQ ID NO: 54 | Condensation domain 1 subdomain I encoded by SEQ ID NO: 53 | 10 residues 
SEQ ID NO: 55 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 1 subdomain II | 27 bases 
SEQ ID NO: 56 | Condensation domain 1 subdomain II encoded by SEQ ID NO: 55 | 9 residues 
SEQ ID NO: 57 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 1 subdomain III | 30 bases 
SEQ ID NO: 58 | Condensation domain 1 subdomain III encoded by SEQ ID NO: 57 | 10 residues 
SEQ ID NO: 59 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 1 subdomain IV | 21 bases 
SEQ ID NO: 60 | Condensation domain 1 subdomain IV encoded by SEQ ID NO: 59 | 7 residues 
SEQ ID NO: 61 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 1 subdomain V | 36 bases 
SEQ ID NO: 62 | Condensation domain 1 subdomain V encoded by SEQ ID NO: 61 | 12 residues 
SEQ ID NO: 63 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 1 subdomain VI | 21 bases 
SEQ ID NO: 64 | Condensation domain 1 subdomain VI encoded by SEQ ID NO: 63 | 7 residues 
SEQ ID NO: 65 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 1 subdomain VII | 24 bases 
SEQ ID NO: 66 | Condensation domain 1 subdomain VII encoded by SEQ ID NO: 65 | 8 residues 
SEQ ID NO: 67 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 2 subdomain I | 30 bases 
SEQ ID NO: 68 | Condensation domain 2 subdomain I encoded by SEQ ID NO: 67 | 10 residues 
SEQ ID NO: 69 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 2 subdomain II | 27 bases 
SEQ ID NO: 70 | Condensation domain 2 subdomain II encoded by SEQ ID NO: 69 | 9 residues 
SEQ ID NO: 71 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 2 subdomain III | 30 bases 
SEQ ID NO: 72 | Condensation domain 2 subdomain III encoded by SEQ ID NO: 71 | 10 residues 
SEQ ID NO: 73 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 2 subdomain IV | 21 bases 
SEQ ID NO: 74 | Condensation domain 2 subdomain IV encoded by SEQ ID NO: 73 | 7 residues 
SEQ ID NO: 75 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 2 subdomain V | 33 bases 
SEQ ID NO: 76 | Condensation domain 2 subdomain V encoded by SEQ ID NO: 75 | 11 residues 
SEQ ID NO: 77 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 2 subdomain VI | 21 bases 
SEQ ID NO: 78 | Condensation domain 2 subdomain VI encoded by SEQ ID NO: 77 | 7 residues 
SEQ ID NO: 79 | Sub-sequence of SEQ ID NO: 1 and 3 encoding condensation domain 2 subdomain VII | 24 bases 
SEQ ID NO: 80 | Condensation domain 2 subdomain VII encoded by SEQ ID NO: 79 | 8 residues 
SEQ ID NO: 81 | Polynucleotide comprising xabB promoter | 242 bases 
SEQ ID NO: 82 | Full-length xabA (Accession No. AF191324) | 1200 bases 
SEQ ID NO: 83 | Full-length polypeptide sequence encoded by SEQ ID NO: 82 | 278 residues 
SEQ ID NO: 84 | Full-length coding sequence of xabA | 837 bases 
SEQ ID NO: 85 | Polypeptide sequence encoded by SEQ ID NO: 84 | 278 residues 
SEQ ID NO: 86 | Sub-sequence of SEQ ID NO: 82 encoding PPTase domain | 168 bases 
SEQ ID NO: 87 | PPTase domain encoded by SEQ ID NO: 86 | 56 residues 
SEQ ID NO: 88 | Sub-sequence of SEQ ID NO: 82 encoding a motif (motif I) conserved in PPTases | 27 bases 
SEQ ID NO: 89 | PPTase motif I amino acid sequence encoded by SEQ ID NO: 88 | 9 residues 
SEQ ID NO: 90 | Sub-sequence of SEQ ID NO: 82 encoding intervening amino acid sequence linking motifs I and II | 117 bases 
SEQ ID NO: 91 | Intervening amino acid sequence encoded by SEQ ID NO: 90 | 39 residues 
SEQ ID NO: 92 | Sub-sequence of SEQ ID NO: 82 encoding a motif (motif II) conserved in PPTases | 36 bases 
SEQ ID NO: 93 | PPTase motif II amino acid sequence encoded by SEQ ID NO: 92 | 12 residues 
SEQ ID NO: 94 | Full-length xabC (Accession No. AF239750) | 1515 bases 
SEQ ID NO: 95 | Full-length polypeptide sequence encoded by SEQ ID NO: 94 | 343 residues 
SEQ ID NO: 96 | Full-length coding sequence of xabC | 1029 bases 
SEQ ID NO: 97 | Polypeptide sequence encoded by SEQ ID NO: 96 | 343 residues 
SEQ ID NO: 98 | Sub-sequence of SEQ ID NO: 94 encoding a motif (motif I) conserved in methyltransferases | 21 bases 
SEQ ID NO: 99 | Methyltransferase motif I amino acid sequence encoded by SEQ ID NO: 98 | 7 residues 
SEQ ID NO: 100 | Sub-sequence of SEQ ID NO: 94 encoding a motif (motif II) conserved in methyltransferases | 24 bases 
SEQ ID NO: 101 | Methyltransferase motif II amino acid sequence encoded by SEQ ID NO: 100 | 8 residues 
SEQ ID NO: 102 | Sub-sequence of SEQ ID NO: 94 encoding a motif (motif III) conserved in methyltransferases | 27 bases 
SEQ ID NO: 103 | Methyltransferase motif III amino acid sequence encoded by SEQ ID NO: 102 | 9 residues 


SEQ TD NO: 104 


Polynucleotide encoding said motifs I, II and III 


303 bases 


SEQ ID NO: 105 


Polypeptide encoded by SEQ ID NO: 104 


101 residues 


SEQ ID NO: 106 


Biologically active fragment of SEQ ID NO: 94 


831 bases 


SEQ ID NO: 107 


Biologically active feagm^t of £50 ID NO: 95 


277 residues 
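
The nucleotide/polypeptide pairs in the table above follow the usual three-bases-per-residue relationship. The sketch below is only an illustrative consistency check on a handful of lengths copied from the table; the dictionary layout and function names are assumptions introduced for this example and are not part of the specification.

```python
# Lengths copied from the table above: (bases in the encoding sub-sequence,
# residues in the encoded sequence). The pairing keys are illustrative labels.
ENCODING_PAIRS = {
    "SEQ ID NO: 67 -> 68": (30, 10),
    "SEQ ID NO: 69 -> 70": (27, 9),
    "SEQ ID NO: 86 -> 87": (168, 56),
    "SEQ ID NO: 90 -> 91": (117, 39),
    "SEQ ID NO: 104 -> 105": (303, 101),
}


def codons_match(bases: int, residues: int) -> bool:
    """Return True when the nucleotide length is exactly three bases per residue."""
    return bases == 3 * residues


def inconsistent_pairs(pairs: dict) -> list:
    """List the labels of pairs whose lengths do not satisfy bases == 3 * residues."""
    return [label for label, (bases, residues) in pairs.items()
            if not codons_match(bases, residues)]


if __name__ == "__main__":
    print(inconsistent_pairs(ENCODING_PAIRS))
```

Full-length coding sequences need not satisfy the exact relationship; for instance 837 bases against 278 residues leaves three bases over, which would be consistent with a terminal stop codon.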
DETAILED DESCRIPTION OF THE INVENTION
1. Definitions 

Unless defined otherwise, all technical and scientific terms used herein have the 
same meaning as commonly understood by those of ordinary skill in the art to which the 
invention belongs. Although any methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the present invention, preferred 
methods and materials are described. For the purposes of the present invention, the 
following terms are defined below. 

The articles "a" and "an" are used herein to refer to one or to more than one (i.e.
to at least one) of the grammatical object of the article. By way of example, "an element"
means one element or more than one element. 

The term "about" is used herein to refer to sequences that vary by as much as 
30%, preferably by as much as 20% and more preferably by as much as 10% to the length 
of a reference sequence. 

By "agent" is meant a naturally occurring or synthetically produced molecule
which interacts either directly or indirectly with a target member, the level and/or
functional activity of which are to be modulated.

"Amplification product" refers to a nucleic acid product generated by nucleic acid
amplification techniques. 

By "antigen-binding molecule" is meant a molecule that has binding affinity for a
target antigen. It will be understood that this term extends to immunoglobulins,
immunoglobulin fragments and non-immunoglobulin derived protein frameworks that 
exhibit antigen-binding activity. 

As used herein, the term "binds specifically" and the like refers to antigen-binding
molecules that bind the polypeptide or polypeptide fragments of the invention but
do not significantly bind to homologous prior art polypeptides.

By "biologically active fragment" is meant a fragment of a full-length parent
polypeptide which fragment retains the activity of the parent polypeptide. A biologically
active fragment will therefore comprise an activity selected from the group consisting of
acyl-CoA ligase activity, β-ketoacyl synthase activity, β-ketoacyl reductase activity, acyl
carrier protein activity, adenylation activity, peptidyl carrier protein activity, condensation
activity, PPTase activity and methyltransferase activity. As used herein, the term
"biologically active fragment" includes deletion mutants and small peptides, for example
of at least 10, preferably at least 20 and more preferably at least 30 contiguous amino
acids, which comprise the above activities. Peptides of this type may be obtained through
the application of standard recombinant nucleic acid techniques or synthesised using
conventional liquid or solid phase synthesis techniques. For example, reference may be
made to solution synthesis or solid phase synthesis as described, for example, in Chapter 9
entitled "Peptide Synthesis" by Atherton and Shephard which is included in a publication
entitled "Synthetic Vaccines" edited by Nicholson and published by Blackwell Scientific
Publications. Alternatively, peptides can be produced by digestion of a polypeptide of the
invention with proteinases such as endoLys-C, endoArg-C, endoGlu-C and staphylococcus
V8-protease. The digested fragments can be purified by, for example, high performance
liquid chromatographic (HPLC) techniques.

Throughout this specification, unless the context requires otherwise, the words 
"comprise", "comprises" and "comprising" will be understood to imply the inclusion of a
stated step or element or group of steps or elements but not the exclusion of any other step 
or element or group of steps or elements. 

By "corresponds to" or "corresponding to" is meant a polynucleotide (a) having a 
nucleotide sequence that is substantially identical or complementary to all or a portion of a 
reference polynucleotide sequence or (b) encoding an amino acid sequence identical to an
amino acid sequence in a peptide or protein. This phrase also includes within its scope a 
peptide or polypeptide having an amino acid sequence that is substantially identical to a 
sequence of amino acids in a reference peptide or protein. 

By "derivative" is meant a polypeptide that has been derived from the basic 
sequence by modification, for example by conjugation or complexing with other chemical
moieties or by post-translational modification techniques as would be understood in the art.
The term "derivative" also includes within its scope alterations that have been made to a
parent sequence including additions or deletions that provide for functionally equivalent
molecules. Accordingly, the term derivative encompasses molecules that will have an
activity selected from the group consisting of acyl-CoA ligase activity, β-ketoacyl synthase
activity, β-ketoacyl reductase activity, acyl carrier protein activity, adenylation activity, peptidyl
carrier protein activity, condensation activity, PPTase activity and methyltransferase 
activity. 

"Homology" refers to the percentage number of amino acids that are identical or 
constitute conservative substitutions as defined in Table B infra. Homology may be 
determined using sequence comparison programs such as GAP (Devereux et al. 1984,
Nucleic Acids Research 12, 387-395). In this way, sequences of a similar or substantially 
different length to those cited herein might be compared by insertion of gaps into the 
alignment, such gaps being determined, for example, by the comparison algorithm used by 
GAP. 

"Hybridisation" is used herein to denote the pairing of complementary nucleotide
sequences to produce a DNA-DNA hybrid or a DNA-RNA hybrid. Complementary base
sequences are those sequences that are related by the base-pairing rules. In DNA, A pairs
with T and C pairs with G. In RNA, U pairs with A and C pairs with G. In this regard, the
terms "match" and "mismatch" as used herein refer to the hybridisation potential of paired
nucleotides in complementary nucleic acid strands. Matched nucleotides hybridise
efficiently, such as the classical A-T and G-C base pairs mentioned above. Mismatches are
other combinations of nucleotides that do not hybridise efficiently. 
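
The base-pairing rules and the match/mismatch terminology above can be illustrated with a short sketch. It assumes two strands of equal length written so that position i of one strand faces position i of the other (i.e. the partner strand is already written antiparallel); the names `MATCHED_PAIRS`, `is_match` and `count_mismatches` are hypothetical helpers introduced only for this illustration.

```python
# Pairs named in the definition above: A-T and C-G in DNA, and A-U where one
# strand of the hybrid is RNA. Each pair is stored as an unordered set.
MATCHED_PAIRS = {
    frozenset("AT"),
    frozenset("AU"),
    frozenset("CG"),
}


def is_match(base_a: str, base_b: str) -> bool:
    """Return True if the two facing bases form one of the matched pairs."""
    return frozenset((base_a.upper(), base_b.upper())) in MATCHED_PAIRS


def count_mismatches(strand: str, partner: str) -> int:
    """Count facing positions that do not form a matched pair.

    Both strands must have the same length, with the partner strand written
    in the orientation in which it faces the first strand.
    """
    if len(strand) != len(partner):
        raise ValueError("strands must be the same length")
    return sum(1 for a, b in zip(strand, partner) if not is_match(a, b))


if __name__ == "__main__":
    print(count_mismatches("ATGC", "TACG"))  # fully matched DNA-DNA hybrid
    print(count_mismatches("ATGC", "UACG"))  # DNA strand facing an RNA strand
    print(count_mismatches("ATGC", "TACC"))  # last position is a C-C mismatch
```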

Reference herein to "immuno-interactive" includes reference to any interaction, 
reaction, or other form of association between molecules and in particular where one of the 
molecules is, or mimics, a component of the immune system.

By "immuno-interactive fragment" is meant a fragment of a parent or reference 
polypeptide as described herein, which fragment elicits an immune response, including the 
production of elements that specifically bind to said polypeptide, or variant or derivative 
thereof. As used herein, the term "immuno-interactive fragment" includes deletion mutants
and small peptides, for example of at least 6, preferably at least 8 and more preferably at
least 20 contiguous amino acids, which comprise antigenic determinants or epitopes.
Several such fragments may be joined together. 

By "isolated" is meant material that is substantially or essentially free from 
components that normally accompany it in its native state. For example, an "isolated 
polynucleotide", as used herein, refers to a polynucleotide, which has been purified from
the sequences which flank it in a naturally occurring state, e.g., a DNA fragment which has
been removed from the sequences which are normally adjacent to the fragment. 

By "modulating" is meant increasing or decreasing, either directly or indirectly, 
the level and/or functional activity of a target molecule. For example, an agent may 
indirectly modulate the said level/activity by interacting with a molecule other than the
target molecule. In this regard, indirect modulation of a gene encoding a target polypeptide 
includes within its scope modulation of the expression of a first nucleic acid molecule, 
wherein an expression product of the first nucleic acid molecule modulates the expression 
of a nucleic acid molecule encoding the target polypeptide. 

By "obtained from" is meant that a sample such as, for example, a nucleic acid
extract or polypeptide extract is isolated from, or derived from, a particular source. For
example, the extract may be isolated directly from any organism that produces secondary 
metabolites, preferably from an albicidin-producing microorganism, more preferably from 
microorganisms of the genus Xanthomonas. 

The term "oligonucleotide" as used herein refers to a polymer composed of a
multiplicity of nucleotide units (deoxyribonucleotides or ribonucleotides, or related
structural variants or synthetic analogues thereof) linked via phosphodiester bonds (or
related structural variants or synthetic analogues thereof). Thus, while the term
"oligonucleotide" typically refers to a nucleotide polymer in which the nucleotides and
linkages between them are naturally occurring, it will be understood that the term also
includes within its scope various analogues including, but not restricted to, peptide nucleic
acids (PNAs), phosphoramidates, phosphorothioates, methyl phosphonates, 2-O-methyl
ribonucleic acids, and the like. The exact size of the molecule may vary depending on the
particular application. An oligonucleotide is typically rather short in length, generally from
about 10 to 30 nucleotides, but the term can refer to molecules of any length, although the
term "polynucleotide" or "nucleic acid" is typically used for large oligonucleotides.

By "operably linked" is meant that transcriptional and translational regulatory
nucleic acids are positioned relative to a polypeptide-encoding polynucleotide in such a 
manner that the polynucleotide is transcribed and the polypeptide is translated. 

The term "polynucleotide" or "nucleic acid" as used herein designates mRNA, 
RNA, cRNA, cDNA or DNA. The term typically refers to oligonucleotides greater than 30
nucleotides in length. 

The terms "polynucleotide variant" and "variant" refer to polynucleotides 
displaying substantial sequence identity with a reference polynucleotide sequence or 
polynucleotides that hybridise with a reference sequence under stringent conditions that are
defined hereinafter. These terms also encompass polynucleotides in which one or more
nucleotides have been added or deleted, or replaced with different nucleotides. In this
regard, it is well understood in the art that certain alterations inclusive of mutations,
additions, deletions and substitutions can be made to a reference polynucleotide whereby
the altered polynucleotide retains the biological function or activity of the reference
polynucleotide. The terms "polynucleotide variant" and "variant" also include naturally
occurring allelic variants.

"Polypeptide", "peptide" and "protein" are used interchangeably herein to refer to
a polymer of amino acid residues and to variants and synthetic analogues of the same.
Thus, these terms apply to amino acid polymers in which one or more amino acid residues
is a synthetic non-naturally occurring amino acid, such as a chemical analogue of a
corresponding naturally occurring amino acid, as well as to naturally-occurring amino acid 
polymers. 

The term "polypeptide variant" refers to polypeptides in which one or more
amino acids have been replaced by different amino acids. It is well understood in the art
that some amino acids may be changed to others with broadly similar properties without
changing the nature of the activity of the polypeptide (conservative substitutions) as
described hereinafter. These terms also encompass polypeptides in which one or more
amino acids have been added or deleted, or replaced with different amino acids.
Accordingly, polypeptide variants as used herein encompass polypeptides that have an
activity selected from the group consisting of acyl-CoA ligase activity, β-ketoacyl synthase
activity, β-ketoacyl reductase activity, acyl carrier protein activity, adenylation activity, peptidyl
carrier protein activity, condensation activity, PPTase activity and methyltransferase
activity.

By "primer" is meant an oligonucleotide which, when paired with a strand of
DNA, is capable of initiating the synthesis of a primer extension product in the presence of
a suitable polymerising agent. The primer is preferably single-stranded for maximum
efficiency in amplification but may alternatively be double-stranded. A primer must be
sufficiently long to prime the synthesis of extension products in the presence of the
polymerisation agent. The length of the primer depends on many factors, including
application, temperature to be employed, template reaction conditions, other reagents, and
source of primers. For example, depending on the complexity of the target sequence, the
oligonucleotide primer typically contains 15 to 35 or more nucleotides, although it may
contain fewer nucleotides. Primers can be large polynucleotides, such as from about 200
nucleotides to several kilobases or more. A primer may be selected to be "substantially
complementary" to the sequence on the template to which it is designed to hybridise and
serve as a site for the initiation of synthesis. By "substantially complementary", it is meant
that the primer is sufficiently complementary to hybridise with a target nucleotide
sequence. Preferably, the primer contains no mismatches with the template to which it is
designed to hybridise but this is not essential. For example, non-complementary
nucleotides may be attached to the 5' end of the primer, with the remainder of the primer
sequence being complementary to the template. Alternatively, non-complementary
nucleotides or a stretch of non-complementary nucleotides can be interspersed into a
primer, provided that the primer sequence has sufficient complementarity with the
sequence of the template to hybridise therewith and thereby form a template for synthesis
of the extension product of the primer.

"Probe" refers to a molecule that binds to a specific sequence or sub-sequence or
other moiety of another molecule. Unless otherwise indicated, the term "probe" typically
refers to a polynucleotide probe that binds to another nucleic acid, often called the "target
nucleic acid", through complementary base pairing. Probes may bind target nucleic acids
lacking complete sequence complementarity with the probe, depending on the stringency
of the hybridisation conditions. Probes can be labelled directly or indirectly.

The term "recombinant polynucleotide" as used herein refers to a polynucleotide
formed in vitro by the manipulation of nucleic acid into a form not normally found in
nature. For example, the recombinant polynucleotide may be in the form of an expression
vector. Generally, such expression vectors include transcriptional and translational
regulatory nucleic acid operably linked to the nucleotide sequence.

By "recombinant polypeptide" is meant a polypeptide made using recombinant 
techniques, i.e., through the expression of a recombinant polynucleotide.

By "reporter molecule" as used in the present specification is meant a molecule 
that, by its chemical nature, provides an analytically identifiable signal that allows the 
detection of a complex comprising an antigen-binding molecule and its target antigen. The
term "reporter molecule" also extends to use of cell agglutination or inhibition of 
agglutination such as red blood cells on latex beads, and the like. 

Terms used to describe sequence relationships between two or more
polynucleotides or polypeptides include "reference sequence", "comparison window",
"sequence identity", "percentage of sequence identity" and "substantial identity". A
"reference sequence" is at least 12 but frequently 15 to 18 and often at least 25 monomer
units, inclusive of nucleotides and amino acid residues, in length. Because two
polynucleotides may each comprise (1) a sequence (i.e., only a portion of the complete
polynucleotide sequence) that is similar between the two polynucleotides, and (2) a
sequence that is divergent between the two polynucleotides, sequence comparisons
between two (or more) polynucleotides are typically performed by comparing sequences of
the two polynucleotides over a "comparison window" to identify and compare local
regions of sequence similarity. A "comparison window" refers to a conceptual segment of
at least 6 contiguous positions, usually about 50 to about 100, more usually about 100 to
about 150 in which a sequence is compared to a reference sequence of the same number of
contiguous positions after the two sequences are optimally aligned. The comparison
window may comprise additions or deletions (i.e., gaps) of about 20% or less as compared
to the reference sequence (which does not comprise additions or deletions) for optimal
alignment of the two sequences. Optimal alignment of sequences for aligning a comparison
window may be conducted by computerised implementations of algorithms (GAP,
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release
7.0, Genetics Computer Group, 575 Science Drive Madison, WI, USA) or by inspection
and the best alignment (i.e., resulting in the highest percentage homology over the
comparison window) generated by any of the various methods selected. Reference also
may be made to the BLAST family of programs as for example disclosed by Altschul et
al., 1997, Nucl. Acids Res. 25:3389. A detailed discussion of sequence analysis can be
found in Unit 19.3 of Ausubel et al., "Current Protocols in Molecular Biology", John
Wiley & Sons Inc, 1994-1998, Chapter 15.

The term "sequence identity" as used herein refers to the extent that sequences
are identical on a nucleotide-by-nucleotide basis or an amino acid-by-amino acid basis
over a window of comparison. Thus, a "percentage of sequence identity" is calculated by
comparing two optimally aligned sequences over the window of comparison, determining
the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, I) or the
identical amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys,
Arg, His, Asp, Glu, Asn, Gln, Cys and Met) occurs in both sequences to yield the number
of matched positions, dividing the number of matched positions by the total number of
positions in the window of comparison (i.e., the window size), and multiplying the result
by 100 to yield the percentage of sequence identity. For the purposes of the present
invention, "sequence identity" will be understood to mean the "match percentage"
calculated by the DNASIS computer program (Version 2.5 for Windows; available from
Hitachi Software Engineering Co., Ltd., South San Francisco, California, USA) using
standard defaults as used in the reference manual accompanying the software.
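
The arithmetic in the definition above (matched positions divided by window size, times 100) can be sketched as follows. This is a minimal illustration only, not the DNASIS "match percentage" routine: it assumes the two sequences have already been optimally aligned to equal length with "-" as the gap character, and the function name `percent_identity` is introduced here for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of window positions carrying the identical base or residue.

    Both inputs are assumed to be pre-aligned strings of equal length; the
    window size is taken to be the full aligned length, and a position where
    both strings hold the gap character is not counted as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    # Seven of the eight positions are identical.
    print(percent_identity("ATGCATGC", "ATGCATGA"))
```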

"Stringency" as used herein, refers to the temperature and ionic strength 
conditions, and presence or absence of certain organic solvents, during hybridisation and 
washing procedures. The higher the stringency, the higher will be the degree of 
complementarity between immobilised target nucleotide sequences and the labelled probe
polynucleotide sequences that remain hybridised to the target after washing. 

"Stringent conditions" refers to temperature and ionic conditions under which 
only nucleotide sequences having a high frequency of complementary bases will hybridise. 
The stringency required is nucleotide sequence dependent and depends upon the various 
components present during hybridisation and subsequent washes, and the time allowed for
these processes. Generally, in order to maximise the hybridisation rate, non-stringent
hybridisation conditions are selected, about 20 to 25 °C lower than the thermal melting
point (Tm). The Tm is the temperature at which 50% of specific target sequence hybridises
to a perfectly complementary probe in solution at a defined ionic strength and pH.
Generally, in order to require at least about 85% nucleotide complementarity of hybridised
sequences, highly stringent washing conditions are selected to be about 5 to 15 °C lower
than the Tm. In order to require at least about 70% nucleotide complementarity of
hybridised sequences, moderately stringent washing conditions are selected to be about 15
to 30 °C lower than the Tm. Highly permissive (low stringency) washing conditions may be
as low as 50 °C below the Tm, allowing a high level of mis-matching between hybridised
sequences. Those skilled in the art will recognise that other physical and chemical
parameters in the hybridisation and wash stages can also be altered to affect the outcome of 
a detectable hybridisation signal from a specific level of homology between target and 
probe sequences. Other examples of stringency conditions are described in section 3.3. 
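
As a worked illustration of the temperature offsets quoted above, the sketch below converts a melting point into the corresponding hybridisation and wash temperature windows. The offsets are taken from the paragraph; the dictionary keys, function names and the example Tm of 70 °C are assumptions made only for this illustration, and the Tm itself is taken as a given input rather than calculated.

```python
# Offsets below the Tm, in degrees Celsius, as (smallest, largest) offset.
OFFSETS_BELOW_TM_C = {
    "hybridisation": (20.0, 25.0),             # non-stringent hybridisation
    "high_stringency_wash": (5.0, 15.0),       # at least about 85% complementarity
    "moderate_stringency_wash": (15.0, 30.0),  # at least about 70% complementarity
}

# Low stringency washes may be as low as this many degrees below the Tm.
LOW_STRINGENCY_MAX_OFFSET_C = 50.0


def temperature_window(tm_c: float, condition: str) -> tuple:
    """Return (lowest, highest) temperature in degrees C for the named condition."""
    smallest, largest = OFFSETS_BELOW_TM_C[condition]
    return (tm_c - largest, tm_c - smallest)


def low_stringency_floor(tm_c: float) -> float:
    """Lowest wash temperature contemplated for highly permissive conditions."""
    return tm_c - LOW_STRINGENCY_MAX_OFFSET_C


if __name__ == "__main__":
    example_tm = 70.0  # hypothetical melting point, for illustration only
    for name in OFFSETS_BELOW_TM_C:
        print(name, temperature_window(example_tm, name))
    print("low stringency floor", low_stringency_floor(example_tm))
```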

By "vector" is meant a nucleic acid molecule, preferably a DNA molecule
derived, for example, from a plasmid, bacteriophage, or plant virus, into which a nucleic
acid sequence may be inserted or cloned. A vector preferably contains one or more unique
restriction sites and may be capable of autonomous replication in a defined host cell
including a target cell or tissue or a progenitor cell or tissue thereof, or be integrable with
the genome of the defined host such that the cloned sequence is reproducible. Accordingly,
the vector may be an autonomously replicating vector, i.e., a vector that exists as an
extrachromosomal entity, the replication of which is independent of chromosomal
replication, e.g., a linear or closed circular plasmid, an extrachromosomal element, a
minichromosome, or an artificial chromosome. The vector may contain any means for
assuring self-replication. Alternatively, the vector may be one which, when introduced into
a cell, is integrated into the genome of the recipient cell and replicated together with the
chromosome(s) into which it has been integrated. A vector system may comprise a single
vector or plasmid, two or more vectors or plasmids, which together contain the total DNA
to be introduced into the genome of the host cell, or a transposon. The choice of the vector
will typically depend on the compatibility of the vector with the cell into which the vector
is to be introduced. The vector may also include a selection marker such as an antibiotic
resistance gene that can be used for selection of suitable transformants. Examples of such
resistance genes are well known to those of skill in the art.

As used herein, underscoring or italicising the name of a gene shall indicate the
gene, in contrast to its protein product, which is indicated by the name of the gene in the 
absence of any underscoring or italicising. For example, "xabB" shall mean the xabB gene, 
whereas "XabB" shall indicate the protein product of the "xabB" gene. 

2. Isolated polypeptides, biologically active fragments, polypeptide variants and
derivatives 

2.1 Polypeptides of the invention 

2.1.1 Albicidin synthetase

The present inventor has also isolated a gene (xabB) encoding a large modular
polyketide synthase (PKS) linked to a non-ribosomal peptide synthetase (NRPS) (predicted
Mr 525,695). At 4801 amino acids in length, the product of xabB (XabB) is the largest
reported PKS-NRPS. Comparison of XabB with available protein sequence databases
reveals an N-terminal region (from Met-1 to Asp-3235) similar to many microbial modular
PKSs, and a C-terminal region (from Pro-3236 to Asp-4801) similar to NRPSs.
Recognisable PKS domains, commencing at the N-terminus of XabB, are an acyl-CoA
ligase (AL), an acyl carrier protein (ACP), a β-ketoacyl synthase (KS1), and a β-ketoacyl
reductase (KR), followed by two consecutive ACPs and one KS (Figure 1). The motifs
characteristic of these domains are aligned with those from other organisms in Figure 3.
The AL domain shows 22-30% identity and 50-60% similarity to prokaryotic and
eukaryotic aromatic acid-CoA ligases and long-chain fatty acid-CoA ligases, and contains
the conserved adenylation core sequence (SGSSG) and the ATPase motif (TGD). The
three ACP domains show up to 39.2% identity and 78.6% similarity to acyl carrier
proteins, and all contain a 4'-phosphopantetheinyl binding cofactor box GxDS(I/L)
(Hopwood and Sherman, 1990), except that A replaces G in ACP1 (Figure 3). The two KS
domains show up to 56.1% identity and 80.8% similarity to β-ketoacyl synthases. Both
contain motif GPxxxxxxxCSxSL around the active site Cys, and two His residues
downstream of the active site Cys, in motifs characteristic of these enzymes (Donadio et
al., 1991; Hopwood, 1997; Huang et al., 1998). The KR domain shows up to 27.9%
identity and 61.8% similarity to β-ketoacyl reductases, and contains the NAD(P)H binding
site GGxGxLG (Scrutton et al., 1990).
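
The consensus motifs quoted above, in which "x" stands for any residue and "(I/L)" for either of two residues, translate directly into regular expressions. The sketch below is illustrative only: the pattern dictionary, the function name and the toy test sequence are assumptions made for this example, and the toy sequence is not taken from XabB.

```python
import re

# Consensus motifs from the text, rewritten as regular expressions:
# "x" becomes "." (any residue) and "(I/L)" becomes the character class [IL].
MOTIF_PATTERNS = {
    "GxDS(I/L)": re.compile(r"G.DS[IL]"),          # ACP cofactor-binding box
    "GPxxxxxxxCSxSL": re.compile(r"GP.{7}CS.SL"),  # KS active-site motif
    "GGxGxLG": re.compile(r"GG.G.LG"),             # KR NAD(P)H binding site
}


def find_motifs(protein: str) -> dict:
    """Return the 1-based start positions of each motif in a protein sequence."""
    return {
        name: [match.start() + 1 for match in pattern.finditer(protein)]
        for name, pattern in MOTIF_PATTERNS.items()
    }


if __name__ == "__main__":
    # Hypothetical toy sequence built so that each motif occurs exactly once.
    toy = "MK" + "GADSL" + "AA" + "GPAAAAAAACSTSL" + "KK" + "GGAGALG" + "E"
    print(find_motifs(toy))
```

A variant box such as the one noted for ACP1, where A replaces the leading G, would need a relaxed first position (for example `[GA].DS[IL]`) to be detected.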
At the C-terminus of XabB is an apparent peptide synthetase region linked to the
PKS module via a peptidyl carrier protein (PCP) domain (Figure 1). The peptide synthetase
region shows 31-38% identity and 60-63% similarity with members of the peptide
synthetase family. It displays the ordered condensation, adenylation, and PCP domains
typical of such multienzymes (Marahiel et al., 1997) followed by an extra condensation
domain. The conserved sequences, characteristic of the domains commonly found in 
peptide synthetases, are compared with those from XabB in Table 2. 

In more detail, the full-length amino acid sequence of the X. albilineans PKS- 
NRPS, presented in SEQ ID NO: 2, extends 4801 residues and includes the following 
sequence signature motifs:

(a) acyl-CoA ligase (AL) motif I extending from about residue 226 to about residue
240, and motif II extending from about residue 486 to about residue 493;

(b) β-ketoacyl synthase 1 (KS1) motif I extending from about residue 897 to about
residue 913, motif II extending from about residue 1038 to about residue 1047, and
motif III extending from about residue 1080 to about residue 1089;

(c) β-ketoacyl synthase 2 (KS2) motif I extending from about residue 2777 to about
residue 2793, motif II extending from about residue 2918 to about residue 2927, and
motif III extending [residue range illegible in source];

(d) β-ketoacyl reductase (KR) motif extending from about residue 1812 to about
residue 1842;

(e) acyl carrier protein 1 (ACP1) motif extending from about residue 667 to about
residue 678;

(f) acyl carrier protein 2 (ACP2) motif extending from about residue 2484 to about
residue 2495;

(g) acyl carrier protein 3 (ACP3) motif extending from about residue 2568 to about
residue 2579;

(h) adenylation domain (A) motif I extending from about residue 3806 to about
residue 3811, motif II extending from about residue 3851 to about residue 3861, motif
III extending from about residue 3917 to about residue 3932, motif IV extending from
about residue 3967 to about residue 3970, motif V extending from about residue 4063 to
about residue 4069, motif VI extending from about residue 4114 to about residue 4128,
motif VII extending from about residue 4152 to about residue 4157, motif VIII
extending from about residue 4170 to about residue 4189, motif IX extending from
about residue 4239 to about residue 4245, and motif X extending from about residue
4259 to about residue 4264;

(i) peptidyl carrier protein 1 (PCP1) motif extending from about residue 3261 to
about residue 3271;

(j) peptidyl carrier protein 2 (PCP2) motif extending from about residue 4306 to
about residue 4316;

(k) condensation domain 1 (C1) motif I extending from about residue 3333 to about
residue 3342, motif II extending from about residue 3381 to about residue 3389,
motif III extending from about residue 3456 to about residue 3465, motif IV extending
from about residue 3495 to about residue 3501, motif V extending from about residue
3606 to about residue 3617, motif VI extending from about residue 3641 to about
residue 3647, and motif VII extending from about residue 3658 to about residue 3665; and

(l) condensation domain 2 (C2) motif I extending from about residue 4374 to about
residue 4383, motif II extending from about residue 4421 to about residue 4429,
motif III extending from about residue 4498 to about residue 4507, motif IV extending
from about residue 4538 to about residue 4544, motif V extending from about residue
4649 to about residue 4659, motif VI extending from about residue 4685 to about
residue 4691, and motif VII extending from about residue 4701 to about residue 4708.
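
The residue ranges listed above lend themselves to a simple lookup structure. The sketch below records the condensation domain 2 (C2) motif coordinates and compares the length of each range with the residue counts given for the C2 subdomains in the sequence table (SEQ ID NO: 68 to 80). It assumes that the "subdomains" of the table correspond to the "motifs" of this list; that correspondence, the dictionary names and the helper functions are assumptions introduced for illustration.

```python
# C2 motif coordinates (inclusive, 1-based) as listed in item (l) above.
C2_MOTIFS = {
    "I": (4374, 4383),
    "II": (4421, 4429),
    "III": (4498, 4507),
    "IV": (4538, 4544),
    "V": (4649, 4659),
    "VI": (4685, 4691),
    "VII": (4701, 4708),
}

# Residue counts given in the sequence table for C2 subdomains I to VII.
C2_TABLE_LENGTHS = {"I": 10, "II": 9, "III": 10, "IV": 7, "V": 11, "VI": 7, "VII": 8}


def motif_length(start: int, end: int) -> int:
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def motif_at(residue: int, motifs: dict):
    """Return the name of the motif containing the residue, or None."""
    for name, (start, end) in motifs.items():
        if start <= residue <= end:
            return name
    return None


if __name__ == "__main__":
    for name, (start, end) in C2_MOTIFS.items():
        print(name, motif_length(start, end), C2_TABLE_LENGTHS[name])
```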

From the above signature motifs, it can be deduced that XabB commences with an
AL domain (residues 1-629) followed by an ACP domain (ACP1, residues 630-731). In
other PKS systems, an N-terminal AL is involved in activation and incorporation of 3,4-
dihydroxycyclohexane carboxylic acid, 3-amino-5-hydroxy benzoic acid (AHBA), or long-
chain fatty acid as a starter (Aparicio et al., 1996; Motamedi and Shafiee, 1998; Tang et
al., 1998; Duitman et al., 1999). The second module in XabB contains a KS (residues 732-
1165), and a KR (residues 1811-1971) upstream of two ACPs (residues 2457-2522, 2544-
2613), but lacks any discernable AT domain (Figure 1). The third module contains a KS
(residues 2630-3046) followed by a PCP (residues 3221-3307) at the start of the XabB
NRPS region.

Four other fused PKS/NRPS systems (Albertini et al., 1995; Gehring et al., 1998;
Duitman et al., 1999; Paitan et al., 1999) are known, three of which lack recognisable AT
domains (Figure 6). Yersinia pestis HMWP1 contains a typical PKS elongation module
(including AT), and an NRPS module with a terminating TE domain. It is the third protein,
following an AL (YbtE) and NRPS (HMWP2) in the biosynthetic apparatus for
yersiniabactin (Gehring et al., 1998). B. subtilis MycA bears the closest resemblance to
XabB, showing PKS initiation and elongation modules linked via an amino transferase
(AMT) domain to the NRPS region. In B. subtilis PksK and M. xanthus Ta1, the NRPS
region precedes the PKS region. Separate AT enzymes encoded elsewhere in the genome
may operate in trans to load the appropriate acyl groups onto the ACPs in the elongation
modules of these PKSs. Candidates are a malonyl-CoA transacylase gene (fenF) located
immediately upstream of mycA (Duitman et al., 1999), and an acyltransferase gene located
20 kb upstream of ta1 (Paitan et al., 1999). Accordingly, it is believed that one or more
such trans-acting AT enzymes may also be involved in connection with the operation of
XabB.

From the characteristics of albicidin, and the architecture of the XabB PKS region
(Figure 1), the inventor considers that: (i) the AL couples coenzyme A to a
shikimate-derived acyl residue in an ATP-dependent reaction, and loads the activated
acyl unit onto the 4'-phosphopantetheine prosthetic arm of ACP1; (ii) an acyl group is
loaded onto ACP2 or ACP3 by a separate acyltransferase; (iii) the KS1 domain accepts the
acyl residue from ACP1 onto a conserved cysteine residue, then transfers it by
decarboxylative condensation onto the acyl group tethered to ACP2 or ACP3; (iv) the
tethered chain is modified by KR; (v) the assembled polyketide intermediate is
translocated via KS2 onto the 4'-phosphopantetheine prosthetic arm of PCP1, at the start
of the NRPS region.

The A domain in the NRPS region of XabB contains ten conserved sequences (A1
to A10, Table 2) identified as AMP, ATP-Mg binding, adenine binding or ATPase sites
(Turgay et al., 1992; Marahiel et al., 1997). In other NRPS systems, A domains select and
load a particular amino acid, nonproteinogenic amino, hydroxyl or carboxy acid (Marahiel
et al., 1997). Substrate specificity is determined at the binding pocket, consisting of a
stretch of about 100 amino acid residues between the highly conserved motifs A4 and A5
(Conti et al., 1997). Sequence alignments for this region reveal some clusters
corresponding with the loaded substrate (Stachelhaus et al., 1999). The A domain from
XabB falls in a diverse cluster of NRPS modules involved in loading of His, Leu or
aromatic amino acids (Phe and Tyr) in other NRPS systems (Figure 7). Based on the
architecture of the XabB NRPS region, it can be inferred that the polyketide intermediate
tethered on PCP1 is accepted by C1 and coupled to the amino, hydroxyl, or carboxy acid
preloaded by A onto PCP2. The final condensation domain at the C-terminus of XabB is
probably involved in peptide-chain termination and cyclisation, as in enniatin, HC-toxin,
rapamycin and FK506 systems (Konz and Marahiel, 1999).

2.1.2 Phosphopantetheinyl transferase associated with albicidin biosynthesis

The present invention also provides a gene (xabA) from X. albilineans encoding a
phosphopantetheinyl transferase (PPTase) associated with XabB function. In this regard,
XabB contains five carrier protein (ACP/PCP) domains, to which the growing polyketide
or polypeptide chain could be covalently tethered. Each functional ACP or PCP domain
must have a specific serine side chain phosphopantetheinylated by a dedicated PPTase
(Lambalot et al., 1996). The product of xabA (XabA) fulfils this function and is required
for post-translational activation of synthetases in the albicidin biosynthetic pathway.

The full-length amino acid sequence of this X. albilineans PPTase, presented in
SEQ ID NO: 83, extends 278 residues and includes the sequence signature motifs for
PPTases which are located as follows: (i) motif I spanning from about residue 159 to about
residue 167; and (ii) motif II spanning from about residue 207 to about residue 218, of
SEQ ID NO: 83. The sequence intervening between the two motifs extends from about
residue 168 to about residue 206 of SEQ ID NO: 83. These conserved sequence motifs and
the intervening sequence are presented for convenience in SEQ ID NO: 89, 93 and 91,
respectively.
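
By way of illustration only, the motif coordinates given above can be checked and applied with a short script. The following is a minimal sketch, assuming 1-based inclusive residue numbering; the names `Span`, `span_length`, `slice_span` and `are_contiguous` are hypothetical helpers introduced here, and the placeholder sequence stands in for SEQ ID NO: 83, which is not reproduced.

```python
from typing import NamedTuple, Sequence


class Span(NamedTuple):
    """A named region given in 1-based, inclusive residue coordinates."""
    name: str
    start: int
    end: int


def span_length(span: Span) -> int:
    """Number of residues covered by the span."""
    return span.end - span.start + 1


def slice_span(sequence: str, span: Span) -> str:
    """Extract the residues of a span from a sequence string."""
    return sequence[span.start - 1:span.end]


def are_contiguous(spans: Sequence[Span]) -> bool:
    """True if the spans, ordered by start, abut with no gap or overlap."""
    ordered = sorted(spans, key=lambda s: s.start)
    return all(a.end + 1 == b.start for a, b in zip(ordered, ordered[1:]))


# Coordinates as stated in the text for the X. albilineans PPTase.
PPTASE_SPANS = [
    Span("motif I", 159, 167),
    Span("intervening sequence", 168, 206),
    Span("motif II", 207, 218),
]

if __name__ == "__main__":
    placeholder = "A" * 278  # stand-in only; not the real sequence
    for span in PPTASE_SPANS:
        print(span.name, span_length(span), slice_span(placeholder, span))
    print("contiguous:", are_contiguous(PPTASE_SPANS))
```

Under these assumptions the three regions abut one another and together cover residues 159 to 218, a stretch of 60 residues.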

The deduced xabA gene product has 56-62% overall similarity to EntD proteins
for enterobactin biosynthesis and 39-56% overall similarity to other enzymes in the
phosphopantetheinyl transferase superfamily. Like entD, xabA includes rarely used codons,
which may impose post-transcriptional control on the rate of gene product formation
(Coderre & Earhart, 1989). Codon optimisation of xabA may, therefore, be useful for
enhancing the production of XabA.

2.1.3 Methyltransferase associated with albicidin biosynthesis 

The invention also provides a gene (xabC) from X. albilineans encoding a
methyltransferase enzyme, more particularly an O-methyltransferase enzyme, which is
required for albicidin production and which when expressed above natural levels leads to
increased levels and/or functional activities of albicidin antibiotics. The full-length amino
acid sequence of this X. albilineans methyltransferase, presented in SEQ ID NO: 95,
extends 343 residues and includes methyltransferase consensus sequence motifs which are
located as follows: (i) motif I spanning from about residue 173 to about residue 180; (ii)
motif II spanning from about residue 236 to about residue 243; and (iii) motif III spanning
from about residue 266 to about residue 274, of SEQ ID NO: 95. These conserved
sequence motifs are presented for convenience in SEQ ID NO: 99, 101 and 103,
respectively.

2.2 Biologically active fragments

The invention also contemplates biological fragments of the above polypeptides
of at least 6 and preferably at least 8 amino acids in length, which comprise an activity
associated with the domains described above. Biologically active fragments
may be produced according to any suitable procedure known in the art. For example, a
suitable method may include first producing a fragment of a parent polypeptide as
described in Section 2.1 and then testing the fragment for the appropriate biological
activity. In one embodiment, the fragment is derived from the albicidin PKS-NRPS of the
invention and is tested for an activity selected from the group consisting of acyl-CoA
ligase activity, β-ketoacyl synthase activity, β-ketoacyl reductase activity, acyl carrier
protein activity, adenylation activity, peptidyl carrier protein activity and condensation
activity.

Any assay that detects or preferably measures such activities is contemplated in
the practice of the present invention. The biologically active fragment suitably comprises
any one or more of the sequence signature motifs described above, or variants thereof.
Preferably, the biologically active fragment comprises all said sequence signature motifs,
or variants thereof.

In another embodiment, the fragment is derived from the PPTase of the invention
and is tested for PPTase activity according to standard assays known to persons of skill in
the art. Suitably, the PPTase catalyses the pantetheinylation, more preferably the
phosphopantetheinylation, of proteins involved in antibiotic biosynthesis, preferably
albicidin biosynthesis. The biologically active fragment preferably comprises the
consensus sequence motifs set forth in SEQ ID NO: 89 and 93, or variants thereof and thus,
more preferably comprises the sequence from about residue 159 to about residue 218 of
SEQ ID NO: 83.

In yet another embodiment, the fragment is derived from the methyltransferase of
the invention and is tested for methyltransferase activity, preferably O-methyltransferase
activity and more preferably S-adenosylmethionine-dependent O-methyltransferase
activity. Suitably, the methyltransferase catalyses the transfer of one or more methyl
groups to an antibiotic precursor, more preferably an albicidin precursor or an intermediate
relating to the biosynthesis of albicidins. The biologically active fragment preferably
comprises the consensus sequence motifs set forth in SEQ ID NO: 99, 101 and 103, or
variants thereof and thus, more preferably comprises the sequence from about residue 173
to about residue 274 of SEQ ID NO: 95 (i.e., SEQ ID NO: 105), or variant of said
sequence. In an especially preferred embodiment, the biologically active fragment
comprises the sequence from about residue 1 to about residue 277 of SEQ ID NO: 95 (i.e.,
SEQ ID NO: 107), or variant of said sequence. An exemplary polynucleotide encoding this
sequence is cloned in vector pLXABB described infra.

Alternatively, biological activity of the fragment is tested by introducing a
polynucleotide from which a fragment of a parent polypeptide can be translated into a cell,
and detecting one or more of the above activities, which is indicative of said fragment
being a biologically active fragment. In one embodiment, such activity can be assayed by
introducing into an albicidin-deficient xabB- X. albilineans mutant (e.g., strain LS157
described herein) a polynucleotide from which a PKS-NRPS-associated fragment can be
produced and assaying for antibiotic activity using a microbial plate assay, as for instance
described in Example 1.

In another embodiment, PPTase activity is assayed by introducing
into an albicidin-deficient xabA- X. albilineans mutant (e.g., strain LS156 described herein)
a polynucleotide from which a PPTase-associated fragment can be produced and assaying
for antibiotic activity using a microbial plate assay, as for instance described in Example 2.

In yet another embodiment, methyltransferase activity is assayed by introducing
into an albicidin-deficient xabC- X. albilineans mutant (e.g., strain LS-JP1 described
herein) a polynucleotide from which a methyltransferase-associated fragment can be
produced and assaying for antibiotic activity using a microbial plate assay, as for
instance described in Example 3.

2.3 Polypeptide variants 

The invention also contemplates polypeptide variants of the polypeptides of the
invention wherein said variants have an activity selected from the group consisting of
acyl-CoA ligase activity, β-ketoacyl synthase activity, β-ketoacyl reductase activity, acyl
carrier protein activity, adenylation activity, peptidyl carrier protein activity,
condensation activity, PPTase activity, and methyltransferase activity. Suitable methods
of producing polypeptide variants include, for example, producing a modified polypeptide
whose sequence is distinguished from a parent polypeptide as described in Section 2.1 or a
biologically active fragment thereof by the substitution, deletion and/or addition of at
least one amino acid. The modified polypeptide is then tested for one or more of said
activities, wherein the presence of that activity indicates that the modified polypeptide
is a variant of the parent polypeptide.

In another embodiment, a polypeptide variant is produced by introducing into a
cell a polynucleotide from which a modified polypeptide can be translated, and detecting
one or more of the activities described above that are associated with the cell, which is
indicative of the modified polypeptide being a polypeptide variant.

In general, variants will have at least 60%, more suitably at least 70%, preferably
at least 80%, and more preferably at least 90% homology to a polypeptide as for example
shown in SEQ ID NO: 4, or a biological fragment thereof. It is preferred that variants
display at least 60%, more suitably at least 70%, preferably at least 75%, more preferably
at least 80%, more preferably at least 85%, more preferably at least 90% and still more
preferably at least 95% sequence identity with a parent polypeptide as described in Section
2.1 or a biologically active fragment thereof. In this respect, the window of comparison
preferably spans about the full length of the polypeptide or of the biologically active
fragment. Suitable variants can be obtained from any secondary metabolite-producing
organism, and preferably from an albicidin-producing organism.
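
As an illustrative sketch only, percent sequence identity over a window of comparison may be computed from two sequences that have already been aligned. The function names below are hypothetical, and the convention adopted here (identical columns divided by all alignment columns that are not gap-only) is an assumption; other conventions for the denominator are in use and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: matches / columns, where columns containing a gap
    in both sequences are ignored and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if identity is at least the given percentage (e.g. 60, 90, 95)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    parent = "ACDEFGHIKL"   # toy sequences, not taken from the sequence listing
    variant = "ACDEFGHIKV"
    print(percent_identity(parent, variant))
    print(meets_threshold(parent, variant, 90.0))
```

The alignment itself is not performed by this sketch; it would be produced beforehand by any suitable alignment program.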

Alternatively, polypeptide variants according to the invention can be identified
either rationally, or via established methods of mutagenesis (see, for example, Watson, J.
D. et al., "MOLECULAR BIOLOGY OF THE GENE", Fourth Edition,
Benjamin/Cummings, Menlo Park, Calif., 1987). Significantly, a random mutagenesis
approach requires no a priori information about the gene sequence that is to be mutated.
This approach has the advantage that it assesses the desirability of a particular mutant
based on its function, and thus does not require an understanding of how or why the
resultant mutant protein has adopted a particular conformation. Indeed, the random
mutation of target gene sequences has been one approach used to obtain mutant proteins
having desired characteristics (Leatherbarrow, R., 1986, J. Prot. Eng. 1: 7-16; Knowles, J.
R., 1987, Science 236: 1252-1258; Shaw, W. V., 1987, Biochem. J. 246: 1-17; Gerlt, J. A.,
1987, Chem. Rev. 87: 1079-1105). Alternatively, where a particular sequence alteration is
desired, methods of site-directed mutagenesis can be employed. Thus, such methods may
be used to selectively alter only those amino acids of the protein that are believed to be
important (Craik, C. S., 1985, Science 228: 291-297; Cronin, et al., 1988, Biochem. 27:
4572-4579; Wilks, et al., 1988, Science 242: 1541-1544).

Variant peptides or polypeptides, resulting from rational or established methods of
mutagenesis or from combinatorial chemistries may comprise conservative amino acid
substitutions. Exemplary conservative substitutions in a polypeptide or polypeptide
fragment according to the invention may be made according to the following table.

TABLE B

Original Residue    Exemplary Substitutions
Ala                 Ser
Arg                 Lys
Asn                 Gln, His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Pro
His                 Asn, Gln
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 Met, Leu, Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp, Phe
Val                 Ile, Leu

Substantial changes in function are made by selecting substitutions that are less
conservative, and relatively fewer of these may be tolerated. Generally, the
substitutions which are likely to produce the greatest changes in a polypeptide's properties
are those in which (a) a hydrophilic residue (e.g., Ser or Asn) is substituted for, or by, a
hydrophobic residue (e.g., Ala, Leu, Ile, Phe or Val); (b) a cysteine or proline is substituted
for, or by, any other residue; (c) a residue having an electropositive side chain (e.g., Arg,
His or Lys) is substituted for, or by, an electronegative residue (e.g., Glu or Asp); or (d) a
residue having a smaller side chain (e.g., Ala, Ser) or no side chain (e.g., Gly) is
substituted for, or by, one having a bulky side chain (e.g., Phe or Trp).
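
For illustration only, the exemplary substitutions table above can be held as a simple lookup so that a proposed replacement is classified as listed or not listed. This is a minimal sketch; the names `EXEMPLARY_SUBSTITUTIONS` and `is_exemplary_conservative` are hypothetical, and a substitution absent from the table is merely "not listed", which does not of itself establish that it alters function.

```python
# Exemplary conservative substitutions, transcribed from the table above
# (three-letter amino acid codes; key = original residue).
EXEMPLARY_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_exemplary_conservative(original: str, replacement: str) -> bool:
    """True if the replacement is listed for the original residue."""
    if original not in EXEMPLARY_SUBSTITUTIONS:
        raise KeyError(f"no table entry for residue {original!r}")
    return replacement in EXEMPLARY_SUBSTITUTIONS[original]


if __name__ == "__main__":
    print(is_exemplary_conservative("Ala", "Ser"))  # listed in the table
    print(is_exemplary_conservative("Gly", "Trp"))  # small to bulky side chain
```

Note that the table is directional as written: Lys lists Glu as a substitution, whereas Glu lists only Asp.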

2.4 Polypeptide derivatives 

A polypeptide can typically tolerate one or more amino acid deletions and
insertions in its amino acid sequence without loss or significant loss of a desired activity.
Accordingly, the invention also contemplates derivatives of the parent polypeptides of the
invention described in Section 2.1 or biologically active fragments thereof or variants of
these, which include amino acid deletions and/or additions, wherein said derivatives
comprise one or more activities selected from the group consisting of acyl-CoA ligase
activity, β-ketoacyl synthase activity, β-ketoacyl reductase activity, acyl carrier protein
activity, adenylation activity, peptidyl carrier protein activity, condensation activity,
PPTase activity and methyltransferase activity associated with antibiotic biosynthesis, and
preferably with albicidin biosynthesis.

Preferred derivatives of the invention include PKS-NRPS molecules with altered
activities in one or more respects, which thus produce polyketides other than the albicidin
natural product(s) of XabB. A PKS-NRPS derived from XabB by such alteration
includes a modular PKS-NRPS (or its corresponding encoding gene(s)) that retains the
scaffolding of the utilised portion encoded by the naturally occurring gene. Not all domains
or modules need be altered. On the constant scaffold, at least one enzymatic activity is
mutated, deleted, replaced, or inserted so as to alter the activity of the resulting PKS-NRPS
relative to the original or parent PKS-NRPS. Alteration results when these activities are
deleted or are replaced by a different version of the activity, or simply mutated in such a
way that a polyketide other than the natural product results from these collective activities.
This occurs because there has been a resulting alteration of the starter unit and/or
elongation unit, and/or stereochemistry, and/or chain length or cyclisation, and/or reductive
or dehydration cycle outcome at a corresponding position in the product polyketide. Where a
deleted activity is replaced, the origin of the replacement activity may come from a
corresponding activity in a different naturally occurring PKS or PKS-NRPS or from a
different region of the albicidin PKS-NRPS. Any or all PKS/NRPS genes may be included
in the derivative or portions of any of these may be included, but the scaffolding of the
albicidin PKS-NRPS protein is preferably retained in whatever derivative is constructed.

Thus, a PKS-NRPS derived from the albicidin PKS-NRPS includes a PKS-NRPS
that contains the scaffolding of all or a portion of XabB. The derived PKS-NRPS also
contains at least two elongation modules that are functional and preferably at least three
elongation modules. The derived PKS-NRPS also contains mutations, deletions, insertions,
or replacements of one or more of the activities of the functional domains or modules of
XabB so that the nature of the resulting polyketide is altered. Exemplary embodiments
include those wherein a KS or ACP domain has been deleted or replaced by a version of
the activity from a different PKS/NRPS or from another location within XabB. Also
contemplated are derivatives where at least one non-condensation cycle enzymatic activity
(KR, KR, or A) has been deleted or added or wherein any of these activities has been
mutated so as to change the structure of the polyketide synthesised by the PKS.

Other derivatives contemplated by the present invention include fusion of the
polypeptides, fragments and polypeptide variants of the invention with other polypeptides
or proteins. For example, it will be appreciated that said polypeptides, fragments or
variants may be incorporated into larger polypeptides, and that such larger polypeptides
may also be expected to have one or more of the activities mentioned above. The
polypeptides, fragments or variants of the invention may be fused to a further protein, for
example, which is not derived from the original host. The further protein may assist in the
purification of the fusion protein. For instance, a polyhistidine tag or a maltose binding
protein may be used in this respect as described in more detail below. Other possible fusion
proteins are those which produce an immunomodulatory response. Particular examples of
such proteins include Protein A or glutathione S-transferase (GST).

Other derivatives contemplated by the invention include, but are not limited to,
modification to side chains, incorporation of unnatural amino acids and/or their derivatives
during peptide, polypeptide or protein synthesis and the use of crosslinkers and other
methods which impose conformational constraints on the polypeptides, fragments and
variants of the invention. Examples of side chain modifications contemplated by the
present invention include modifications of amino groups such as by acylation with acetic
anhydride; acylation of amino groups with succinic anhydride and tetrahydrophthalic
anhydride; amidination with methylacetimidate; carbamoylation of amino groups with
cyanate; pyridoxylation of lysine with pyridoxal-5-phosphate followed by reduction with
NaBH4; reductive alkylation by reaction with an aldehyde followed by reduction with
NaBH4; and trinitrobenzylation of amino groups with 2,4,6-trinitrobenzene sulphonic acid
(TNBS). The carboxyl group may be modified by carbodiimide activation via
O-acylisourea formation followed by subsequent derivatisation, by way of example, to a
corresponding amide. The guanidine group of arginine residues may be modified by
formation of heterocyclic condensation products with reagents such as 2,3-butanedione,
phenylglyoxal and glyoxal. Sulphydryl groups may be modified by methods such as
performic acid oxidation to cysteic acid; formation of mercurial derivatives using
4-chloromercuriphenylsulphonic acid, 4-chloromercuribenzoate,
2-chloromercuri-4-nitrophenol, phenylmercury chloride, and other mercurials; formation of
mixed disulphides with other thiol compounds; reaction with maleimide, maleic anhydride or
other substituted maleimide; carboxymethylation with iodoacetic acid or iodoacetamide;
and carbamoylation with cyanate at alkaline pH. Tryptophan residues may be modified, for
example, by alkylation of the indole ring with 2-hydroxy-5-nitrobenzyl bromide or
sulphonyl halides or by oxidation with N-bromosuccinimide. Tyrosine residues may be
modified by nitration with tetranitromethane to form a 3-nitrotyrosine derivative. The
imidazole ring of a histidine residue may be modified by N-carbethoxylation with
diethylpyrocarbonate or by alkylation with iodoacetic acid derivatives.

Examples of incorporating unnatural amino acids and derivatives during peptide
synthesis include but are not limited to, use of 4-amino butyric acid, 6-aminohexanoic acid,
4-amino-3-hydroxy-5-phenylpentanoic acid, 4-amino-3-hydroxy-6-methylheptanoic acid,
t-butylglycine, norleucine, norvaline, phenylglycine, ornithine, sarcosine, 2-thienyl alanine
and/or D-isomers of amino acids. A list of unnatural amino acids contemplated by the
present invention is shown in TABLE C.

TABLE C

Non-conventional amino acid                         Non-conventional amino acid
α-aminobutyric acid                                 L-N-methylalanine
α-amino-α-methylbutyrate                            L-N-methylarginine
aminocyclopropane-carboxylate                       L-N-methylasparagine
aminoisobutyric acid                                L-N-methylaspartic acid
aminonorbornyl-carboxylate                          L-N-methylcysteine
cyclohexylalanine                                   L-N-methylglutamine
cyclopentylalanine                                  L-N-methylglutamic acid
L-N-methylisoleucine                                L-N-methylhistidine
D-alanine                                           L-N-methylleucine
D-arginine                                          L-N-methyllysine
D-aspartic acid                                     L-N-methylmethionine
D-cysteine                                          L-N-methylnorleucine
D-glutamate                                         L-N-methylnorvaline
D-glutamic acid                                     L-N-methylornithine
D-histidine                                         L-N-methylphenylalanine
D-isoleucine                                        L-N-methylproline
D-leucine                                           L-N-methylserine
D-lysine                                            L-N-methylthreonine
D-methionine                                        L-N-methyltryptophan
D-ornithine                                         L-N-methyltyrosine
D-phenylalanine                                     L-N-methylvaline
D-proline                                           L-N-methylethylglycine
D-serine                                            L-N-methyl-t-butylglycine
D-threonine                                         L-norleucine
D-tryptophan                                        L-norvaline
D-tyrosine                                          α-methyl-aminoisobutyrate
D-valine                                            α-methyl-γ-aminobutyrate
D-α-methylalanine                                   α-methylcyclohexylalanine
D-α-methylarginine                                  α-methylcyclopentylalanine
D-α-methylasparagine                                α-methyl-α-naphthylalanine
D-α-methylaspartate                                 α-methylpenicillamine
D-α-methylcysteine                                  N-(4-aminobutyl)glycine
D-α-methylglutamine                                 N-(2-aminoethyl)glycine
D-α-methylhistidine                                 N-(3-aminopropyl)glycine
D-α-methylisoleucine                                N-amino-α-methylbutyrate
D-α-methylleucine                                   α-naphthylalanine
D-α-methyllysine                                    N-benzylglycine
D-α-methylmethionine                                N-(2-carbamylethyl)glycine
D-α-methylornithine                                 N-(carbamylmethyl)glycine
D-α-methylphenylalanine                             N-(2-carboxyethyl)glycine
D-α-methylproline                                   N-(carboxymethyl)glycine
D-α-methylserine                                    N-cyclobutylglycine
D-α-methylthreonine                                 N-cycloheptylglycine
D-α-methyltryptophan                                N-cyclohexylglycine
D-α-methyltyrosine                                  N-cyclodecylglycine
L-α-methylleucine                                   L-α-methyllysine
L-α-methylmethionine                                L-α-methylnorleucine
L-α-methylnorvaline                                 L-α-methylornithine
L-α-methylphenylalanine                             L-α-methylproline
L-α-methyltryptophan                                L-α-methyltyrosine
L-α-methylvaline                                    L-N-methylhomophenylalanine
N-(N-(2,2-diphenylethyl)carbamylmethyl)glycine      N-(N-(3,3-diphenylpropyl)carbamylmethyl)glycine
1-carboxy-1-(2,2-diphenyl-ethylamino)cyclopropane





Also contemplated is the use of crosslinkers, for example, to stabilise 3D
conformations of the polypeptides, fragments or variants of the invention, using
homo-bifunctional crosslinkers such as bifunctional imido esters having (CH2)n spacer groups
with n = 1 to n = 6, glutaraldehyde, N-hydroxysuccinimide esters and hetero-bifunctional
reagents which usually contain an amino-reactive moiety such as N-hydroxysuccinimide
and another group specific-reactive moiety such as maleimido or dithio moiety or
carbodiimide. In addition, peptides can be conformationally constrained, for example, by
introduction of double bonds between Cα and Cβ atoms of amino acids, by incorporation of
Cα and Nα-methylamino acids, and by formation of cyclic peptides or analogues by
introducing covalent bonds such as forming an amide bond between the N and C termini,
between two side chains or between a side chain and the N or C terminus of the peptides or
analogues. For example, reference may be made to: Marlowe (1993, Bioorganic &
Medicinal Chemistry Letters 3: 437-44) who describes peptide cyclisation on TFA resin
using trimethylsilyl (TMSE) ester as an orthogonal protecting group; Pallin and Tam
(1995, J. Chem. Soc. Chem. Comm. 2021-2022) who describe the cyclisation of
unprotected peptides in aqueous solution by oxime formation; Algin et al. (1994,
Tetrahedron Letters 35: 9633-9636) who disclose solid-phase synthesis of head-to-tail
cyclic peptides via lysine side-chain anchoring; Kates et al. (1993, Tetrahedron Letters 34:
1549-1552) who describe the production of head-to-tail cyclic peptides by
three-dimensional solid phase strategy; Tumelty et al. (1994, J. Chem. Soc. Chem. Comm.
1067-1068) who describe the synthesis of cyclic peptides from an immobilised activated
intermediate, wherein activation of the immobilised peptide is carried out with
N-protecting group intact and subsequent removal leading to cyclisation; McMurray et al.
(1994, Peptide Research 7: 195-206) who disclose head-to-tail cyclisation of peptides
attached to insoluble supports by means of the side chains of aspartic and glutamic acid;
Hruby et al. (1994, Reactive Polymers 22: 231-241) who teach an alternate method for
cyclising peptides via solid supports; and Schmidt and Langer (1997, J. Peptide Res. 49:
67-73) who disclose a method for synthesising cyclotetrapeptides and cyclopentapeptides.
The foregoing methods may be used to produce conformationally constrained polypeptides
that comprise one or more activities selected from the group consisting of acyl-CoA ligase
activity, β-ketoacyl synthase activity, β-ketoacyl reductase activity, acyl carrier protein
activity, adenylation activity, peptidyl carrier protein activity, condensation activity,
PPTase activity and methyltransferase activity associated with the production of
polyketides and particularly albicidins or analogues thereof.

The invention also contemplates polypeptides, fragments or variants of the
invention that have been modified using ordinary molecular biological techniques so as to
improve their resistance to proteolytic degradation or to optimise solubility properties or to
render them more suitable as an immunogenic agent.

3. Polynucleotides of the invention

3.1 Polynucleotides encoding polypeptides of the invention

3.1.1 Albicidin synthetase-encoding polynucleotides

The invention further provides a polynucleotide that encodes a PKS-NRPS
polypeptide of the invention, or biologically active fragment thereof, or variant or
derivative of these as defined above. In one embodiment, the polynucleotide comprises the
entire sequence of nucleotides set forth in SEQ ID NO: 1. SEQ ID NO: 1 corresponds to a
16511-bp X. albilineans xabB cistron. SEQ ID NO: 3 defines the full-length coding
sequence of xabB and encodes various sequence signature motifs at the following
nucleotide positions:

(a) acyl-CoA ligase (AL) motif I from about nucleotide 676 to about nucleotide 720, 
and motif EL from about nucleotide 1456 to about nucleotide 1477; 

(b) 0-ketoacyl synthase 1 (KS1) motif I from about nucleotide 2689 to about 
nucleotide 2739, motif H from about nucleotide 3112 to about nucleotide 3141, and 

15 motif III from about nucleotide 3238 to about nucleotide 3267; 

(c) 0-ketoacyl synthase 2 (KS2) motif I from about nucleotide 8329 to about 
imetattine .8379, oM»f,E from »*»..«* Kudeotide 8752 to about nucleotide .08.1; and 
motif HI from about nucleotide 8863 to about nucleotide 8892; 

(d) /3-ketoacyl reductase (KR) motif from about nucleotide 5434 to about nucleotide 
20 5526; 

(e) acyl carrier protein 1 (ACPI) motif from about nucleotide 1999 to about 
nucleotide 2034; 

(f) acyl carrier protein 2 (ACP2) motif from about nucleotide 7450 to about 
nucleotide 7485; 

25 (g) acyl carrier protein 3 (ACP3) motif from about nucleotide 7702 to about 

nucleotide 7735; 

(h) adenylation domain (A) motif I from about nucleotide 11416 to about nucleotide
11433, motif II from about nucleotide 11551 to about nucleotide 11583, motif III from
about nucleotide 11749 to about nucleotide 11796, motif IV from about nucleotide
11899 to about nucleotide 11910, motif V from about nucleotide 12187 to about
nucleotide 12207, motif VI from about nucleotide 12340 to about nucleotide 12384,
motif VII from about nucleotide 12454 to about nucleotide 12471, motif VIII from
about nucleotide 12508 to about nucleotide 12567, motif IX from about nucleotide
12715 to about nucleotide 12735, and motif X from about nucleotide 127 /5 to about
nucleotide 12792;

(i) peptidyl carrier protein 1 (PCP1) motif from about nucleotide 9781 to about
nucleotide 9813;

(j) peptidyl carrier protein 2 (PCP2) motif from about nucleotide 129*5 to about 
nucleotide 12948; 

(k) condensation domain 1 (C1) motif I from about nucleotide 9997 to about
nucleotide 10026, motif II from about nucleotide 10141 to about nucleotide 10167,
motif III from about nucleotide 10366 to about nucleotide 10395, motif IV from about
nucleotide 10483 to about nucleotide 10503, motif V from about nucleotide 10816 to
about nucleotide 10851, motif VI from about nucleotide 10921 to about nucleotide
10941, and motif VII from about nucleotide 10972 to about nucleotide 10995; and

(l) condensation domain 2 (C2) motif I from about nucleotide 13120 to about
nucleotide 13149, motif II from about nucleotide 13261 to about nucleotide 13287,
motif III from about nucleotide 13492 to about nucleotide 13521, motif IV from about
nucleotide 13612 to about nucleotide 13632, motif V from about nucleotide 13945 to
about nucleotide i3077r, motif VI from about nucleotide 14053 to about nucleotide
14073, and motif VII from about nucleotide 14101 to about nucleotide 14124.
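
By way of illustration only, the motif positions listed above can be handled as 1-based, inclusive nucleotide coordinates. The following minimal sketch assumes that convention and uses only a few of the legible coordinates; the names `MOTIFS`, `motif_length`, `motif_codons` and `motif_slice` are hypothetical helpers and form no part of the disclosure.

```python
# Illustrative only: a few motif coordinates taken from the list above,
# treated as 1-based, inclusive nucleotide positions (an assumption).
MOTIFS = {
    "AL motif I": (676, 720),
    "ACP1 motif": (1999, 2034),
    "KS1 motif I": (2689, 2739),
    "KR motif": (5434, 5526),
}


def motif_length(start, end):
    """Number of nucleotides spanned by 1-based, inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    return end - start + 1


def motif_codons(start, end):
    """Number of whole codons in the span; raises if it is not a multiple of three."""
    length = motif_length(start, end)
    if length % 3:
        raise ValueError("span is not a whole number of codons")
    return length // 3


def motif_slice(sequence, start, end):
    """Extract a motif from a sequence string using 1-based, inclusive coordinates."""
    motif_length(start, end)  # validates start and end
    if end > len(sequence):
        raise ValueError("end coordinate lies beyond the sequence")
    return sequence[start - 1:end]
```

Each of the spans used in this sketch is a whole number of codons, which is consistent with the motifs being defined on the encoded amino acid sequence.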

Those of skill in the art will recognise that, due to the degenerate nature of the 
genetic code, a variety of polynucleotides differing in their nucleotide sequences can be 
used to encode a given amino acid sequence of the invention. The native polynucleotide 
sequence encoding the PKS-NRPS of X. albilineans is shown herein merely to illustrate a 
preferred embodiment of the invention, and the invention includes polynucleotides of any
sequence that encode the amino acid sequences of the polypeptides and proteins of the 
invention. 
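
The extent of this degeneracy can be illustrated by counting how many distinct nucleotide sequences encode a given peptide. The sketch below assumes the standard genetic code (61 sense codons); the table and the function name `count_coding_sequences` are illustrative and are not taken from the specification.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter residue codes; assumption: no non-standard codon usage).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide):
    """Count the distinct coding sequences that encode the given peptide."""
    try:
        return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]}") from None
```

Even a tripeptide such as Leu-Ser-Arg can be encoded by 6 x 6 x 6 = 216 different nucleotide sequences, so the number of polynucleotides encoding a full-length polypeptide is very large.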

3.1.2 PPTase-encoding polynucleotides

The invention further provides a polynucleotide that encodes a PPTase
polypeptide of the invention, or biologically active fragment thereof, or variant or
derivative of these as defined above. In one embodiment, the polynucleotide comprises the
entire sequence of nucleotides set forth in SEQ ID NO: 82. SEQ ID NO: 82 corresponds to
a 1200-bp X. albilineans xabA cistron. This sequence encodes a PPTase catalytic domain
from about nucleotide 475 to about nucleotide 654. This domain comprises two conserved
PPTase sequence motifs: (I) motif I encoded by a nucleotide sequence from about
nucleotide 475 to about nucleotide 501; and (II) motif II encoded by a nucleotide sequence
from about nucleotide 619 to about nucleotide 654, of SEQ ID NO: 82. The intervening
amino acid sequence, linking motifs I and II, is encoded by a nucleotide sequence from
about nucleotide 502 to about nucleotide 618 of SEQ ID NO: 82. The said nucleotide
sequences are presented for convenience in SEQ ID NO: 86, 88, 92 and 90, respectively.
Suitably, the polynucleotide comprises the sequence set forth in SEQ ID NO: 84, which
defines the full-length coding sequence of xabA. Alternatively, the polynucleotide
comprises a contiguous sequence of nucleotides contained within the sequence set forth in
SEQ ID NO: 86, which encodes the PPTase catalytic domain.

3.1.3 Methyltransferase-encoding polynucleotides 

The invention further provides a polynucleotide that encodes a methyltransferase
polypeptide of the invention, or biologically active fragment thereof, or variant or
derivative of these as defined above. In one embodiment, the polynucleotide comprises the
entire sequence of nucleotides set forth in SEQ ID NO: 94. SEQ ID NO: 94 corresponds to
a 1515-bp X. albilineans xabC cistron. This sequence encodes three conserved
methyltransferase sequence motifs: (I) motif I encoded by a nucleotide sequence from
about nucleotide 565 to about nucleotide 585; (II) motif II encoded by a nucleotide
sequence from about nucleotide 741 to about nucleotide 774; and (III) motif III encoded by
a nucleotide sequence from about nucleotide 841 to about nucleotide 867, of SEQ ID NO:
94. The said nucleotide sequences are presented for convenience in SEQ ID NO: 98, 100
and 102, respectively. Suitably, the polynucleotide comprises the sequence set forth in
SEQ ID NO: 96, which defines the full-length coding sequence of xabC. Alternatively, the
polynucleotide comprises a contiguous sequence of nucleotides contained within the
sequence set forth in SEQ ID NO: 104 or 106, which encode biologically active fragments
as described in Section 2.2.

3.2 Polynucleotide variants

In general, polynucleotide variants according to the invention comprise regions
that show at least 60%, more suitably at least 70%, preferably at least 80%, and more
preferably at least 90% sequence identity over a reference polynucleotide sequence of
identical size ("comparison window") or when compared to an aligned sequence in which
the alignment is performed by a computer homology program known in the art. What
constitutes suitable variants may be determined by conventional techniques. For example,
a polynucleotide comprising at least one sequence selected from the group consisting of
SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43,
45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 82, 84, 86, 88, 90, 92,
94, 96, 98, 100, 102 and 104 can be altered using any suitable method including
conventional recombinant techniques and mutagenesis methods such as random
mutagenesis (e.g., transposon mutagenesis), oligonucleotide-mediated (or site-directed)
mutagenesis, PCR mutagenesis and cassette mutagenesis of an earlier prepared variant or
non-variant version of an isolated polynucleotide of the invention.
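
The sequence identity thresholds recited above (60%, 70%, 80% and 90%) can be illustrated with a minimal sketch that compares two sequences of identical size, position by position, over a comparison window. The sketch assumes ungapped, already aligned sequences; it does not stand in for any particular computer homology program, and the function names are hypothetical.

```python
def percent_identity(reference, candidate):
    """Percentage of identical positions between two equal-length sequences."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and of identical size")
    matches = sum(a == b for a, b in zip(reference.upper(), candidate.upper()))
    return 100.0 * matches / len(reference)


def identity_tier(percent):
    """Return the highest recited threshold (90, 80, 70 or 60) that is met, else None."""
    for threshold in (90, 80, 70, 60):
        if percent >= threshold:
            return threshold
    return None
```

For example, two 10-nucleotide windows differing at two positions show 80% identity and therefore meet the "at least 80%" threshold but not the "at least 90%" threshold.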

Alternatively, polynucleotide sequence variants encoding heterologous
PKS/NRPS enzymes for producing PKS-NRPS variants of the invention may be obtained
from other secondary metabolite- or polyketide-producing organisms. For example, such
variants may be prepared according to the following procedure:

(a) creating primers which are optionally degenerate wherein each comprises a
portion of a reference polynucleotide encoding a reference polypeptide or fragment of
the invention, preferably encoding at least one sequence selected from the group
consisting of SEQ ID NO: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36,
38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 83,
87, 89, 91, 93, 95, 99, 101, 103, 105 and 107;

(b) obtaining a nucleic acid extract from a secondary metabolite-producing
organism, which is preferably a bacterium, more preferably from a species of the family
Pseudomonadaceae, more preferably from a Xanthomonas species; and

(c) using said primers to amplify, via nucleic acid amplification techniques, at
least one amplification product from said nucleic acid extract, wherein said
amplification product corresponds to a polynucleotide variant.

Suitable nucleic acid amplification techniques are well known to the skilled
addressee, and include polymerase chain reaction (PCR) as for example described in
Ausubel et al. (supra); strand displacement amplification (SDA) as for example described
in U.S. Patent No 5,422,252; rolling circle replication (RCR) as for example described in
Liu et al., (1996, J. Am. Chem. Soc. 118:1587-1594 and International application WO
92/01813) and Lizardi et al., (International Application WO 97/19193); nucleic acid
sequence-based amplification (NASBA) as for example described by Sooknanan et al.,
(1994, Biotechniques 17:1077-1080); and Q-β replicase amplification as for example
described by Tyagi et al., (1996, Proc. Natl. Acad. Sci. USA 93: 5395-5400).

Typically, polynucleotide variants that are substantially complementary to a
reference polynucleotide are identified by blotting techniques that include a step whereby
nucleic acids are immobilised on a matrix (preferably a synthetic membrane such as
nitrocellulose), followed by a hybridisation step, and a detection step. Southern blotting is
used to identify a complementary DNA sequence; northern blotting is used to identify a
complementary RNA sequence. Dot blotting and slot blotting can be used to identify
complementary DNA/DNA, DNA/RNA or RNA/RNA polynucleotide sequences. Such
techniques are well known by those skilled in the art, and have been described in Ausubel
et al. (supra) at pages 2.9.1 through 2.9.20.

According to such methods, Southern blotting involves separating DNA
molecules according to size by gel electrophoresis, transferring the size-separated DNA to
a synthetic membrane, and hybridising the membrane-bound DNA to a complementary
nucleotide sequence labelled radioactively, enzymatically or fluorochromatically. In dot
blotting and slot blotting, DNA samples are directly applied to a synthetic membrane prior
to hybridisation as above. An alternative blotting step is used when identifying
complementary polynucleotides in a cDNA or genomic DNA library, such as through the
process of plaque or colony hybridisation. A typical example of this procedure is described
in Sambrook et al. ("Molecular Cloning. A Laboratory Manual", Cold Spring Harbour
Press, 1989) Chapters 8-12.

Typically, the following general procedure can be used to determine hybridisation
conditions. Polynucleotides are blotted/transferred to a synthetic membrane, as described
above. A reference polynucleotide such as a polynucleotide of the invention is labelled as
described above, and the ability of this labelled polynucleotide to hybridise with an
immobilised polynucleotide is analysed. A skilled addressee will recognise that a number
of factors influence hybridisation. The specific activity of radioactively labelled
polynucleotide sequence should typically be greater than or equal to about 10^8 dpm/mg to
provide a detectable signal. A radiolabelled nucleotide sequence of specific activity 10^8 to
10^9 dpm/mg can detect approximately 0.5 pg of DNA. It is well known in the art that
sufficient DNA must be immobilised on the membrane to permit detection. It is desirable
to have excess immobilised DNA, usually 10 µg. Adding an inert polymer such as 10%
(w/v) dextran sulfate (MW 500,000) or polyethylene glycol 6000 during hybridisation can
also increase the sensitivity of hybridisation (see Ausubel supra at 2.10.10).

To achieve meaningful results from hybridisation between a polynucleotide
immobilised on a membrane and a labelled polynucleotide, a sufficient amount of the
labelled polynucleotide must be hybridised to the immobilised polynucleotide following
washing. Washing ensures that the labelled polynucleotide is hybridised only to the
immobilised polynucleotide with a desired degree of complementarity to the labelled
polynucleotide. It will be understood that polynucleotide variants according to the
invention will hybridise to a reference polynucleotide under at least low stringency
conditions. Reference herein to low stringency conditions includes and encompasses from at
least about 1% v/v to at least about 15% v/v formamide and from at least about 1 M to at
least about 2 M salt for hybridisation at 42° C, and at least about 1 M to at least about 2 M
salt for washing at 42° C. Low stringency conditions also may include 1% Bovine Serum
Albumin (BSA), 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS for hybridisation at
65° C, and (i) 2xSSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH
7.2), 5% SDS for washing at room temperature.

Suitably, the polynucleotide variants hybridise to a reference polynucleotide under
at least medium stringency conditions. Medium stringency conditions include and
encompass from at least about 16% v/v to at least about 30% v/v formamide and from at
least about 0.5 M to at least about 0.9 M salt for hybridisation at 42° C, and at least about
0.1 M to at least about 0.2 M salt for washing at 55° C. Medium stringency conditions also
may include 1% Bovine Serum Albumin (BSA), 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2),
7% SDS for hybridisation at 65° C, and (i) 2xSSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM
EDTA, 40 mM NaHPO4 (pH 7.2), 5% SDS for washing at 60-65° C.

Preferably, the polynucleotide variants hybridise to a reference polynucleotide
under high stringency conditions. High stringency conditions include and encompass from
at least about 31% v/v to at least about 50% v/v formamide and from about 0.01 M to
about 0.15 M salt for hybridisation at 42° C, and about 0.01 M to about 0.02 M salt for
washing at 55° C. High stringency conditions also may include 1% BSA, 1 mM EDTA, 0.5
M NaHPO4 (pH 7.2), 7% SDS for hybridisation at 65° C, and (i) 0.2xSSC, 0.1% SDS; or
(ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), 1% SDS for washing at a
temperature in excess of 65° C.
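
For convenience, the formamide ranges recited above for hybridisation at 42° C can be expressed as a simple lookup. The sketch below is illustrative only: it encodes the three stated ranges as written, so a concentration falling between two ranges (for example 15.5% v/v) or outside all of them is reported as unclassified. The names `FORMAMIDE_RANGES` and `stringency_from_formamide` are hypothetical.

```python
# Formamide ranges (% v/v) for hybridisation at 42 degrees C, as recited in the text.
FORMAMIDE_RANGES = (
    ("low", 1.0, 15.0),
    ("medium", 16.0, 30.0),
    ("high", 31.0, 50.0),
)


def stringency_from_formamide(percent_formamide):
    """Return 'low', 'medium' or 'high' for a formamide concentration, else None."""
    for name, lower, upper in FORMAMIDE_RANGES:
        if lower <= percent_formamide <= upper:
            return name
    return None
```

Salt concentration and wash temperature also form part of each stringency definition, so formamide concentration alone is only a partial guide.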

Other stringent conditions are well known in the art. A skilled addressee will
recognise that various factors can be manipulated to optimise the specificity of the
hybridisation. Optimisation of the stringency of the final washes can serve to ensure a high
degree of hybridisation. For detailed examples, see Ausubel et al., supra at pages 2.10.1 to
2.10.16 and Sambrook et al. (1989, supra) at sections 1.101 to 1.104.

While stringent washes are typically carried out at temperatures from about 42° C
to 68° C, one skilled in the art will appreciate that other temperatures may be suitable for
stringent conditions. Maximum hybridisation rate typically occurs at about 20° C to 25° C
below the T_m for formation of a DNA-DNA hybrid. It is well known in the art that the T_m
is the melting temperature, or temperature at which two complementary polynucleotide
sequences dissociate. Methods for estimating T_m are well known in the art (see Ausubel et
al., supra at page 2.10.8).

In general, the T_m of a perfectly matched duplex of DNA may be predicted as an
approximation by the formula:

$$T_m = 81.5 + 16.6\,(\log_{10} M) + 0.41\,(\%\,\mathrm{G{+}C}) - 0.63\,(\%\,\mathrm{formamide}) - (600/\mathrm{length})$$

wherein: M is the concentration of Na+, preferably in the range of 0.01 molar to
0.4 molar; %G+C is the sum of guanosine and cytosine bases as a percentage of the total
number of bases, within the range between 30% and 75% G+C; % formamide is the
percent formamide concentration by volume; and length is the number of base pairs in the
DNA duplex.
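
As a worked illustration of the above formula, the following minimal sketch evaluates the approximation for given conditions. The function name and parameter names are hypothetical, and the preferred ranges for Na+ and %G+C noted above are not enforced.

```python
import math


def melting_temperature(na_molar, gc_percent, formamide_percent, length_bp):
    """Approximate T_m (degrees C) of a perfectly matched DNA duplex.

    Implements: 81.5 + 16.6*log10(M) + 0.41*(%G+C) - 0.63*(% formamide) - 600/length
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("Na+ concentration and duplex length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length_bp
    )
```

For example, a 600-bp duplex with 50% G+C in 0.1 M Na+ without formamide gives 81.5 - 16.6 + 20.5 - 0 - 1 = 84.4° C; adding 50% formamide lowers the estimate by 31.5° C to 52.9° C.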

The T_m of a duplex DNA decreases by approximately 1° C with every increase of
1% in the number of randomly mismatched base pairs. Washing is generally carried out at
T_m - 15° C for high stringency, or T_m - 30° C for moderate stringency.
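
The mismatch correction and the wash temperatures just described reduce to simple arithmetic, sketched below under the stated approximations (1° C per 1% mismatch; washes at 15° C or 30° C below T_m). The names `mismatch_adjusted_tm` and `wash_temperature` are hypothetical.

```python
# Offsets below T_m (degrees C) at which washing is generally carried out.
WASH_OFFSETS_C = {"high": 15.0, "moderate": 30.0}


def mismatch_adjusted_tm(tm_perfect_c, percent_mismatch):
    """Lower T_m by approximately 1 degree C per 1% randomly mismatched base pairs."""
    if percent_mismatch < 0:
        raise ValueError("percent mismatch cannot be negative")
    return tm_perfect_c - 1.0 * percent_mismatch


def wash_temperature(tm_c, stringency):
    """Wash temperature for 'high' or 'moderate' stringency, given a T_m in degrees C."""
    return tm_c - WASH_OFFSETS_C[stringency]
```

For a duplex with a predicted T_m of 84.4° C, a high stringency wash would be carried out at about 69.4° C and a moderate stringency wash at about 54.4° C; a variant with 10% mismatch would have an estimated T_m of about 74.4° C.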

In a preferred hybridisation procedure, a membrane (e.g., a nitrocellulose
membrane or a nylon membrane) containing immobilised DNA is hybridised overnight at
42° C in a hybridisation buffer (50% deionised formamide, 5xSSC, 5x Denhardt's solution
(0.1% ficoll, 0.1% polyvinylpyrrolidone and 0.1% bovine serum albumin), 0.1% SDS and
200 mg/mL denatured salmon sperm DNA) containing labelled probe. The membrane is
then subjected to two sequential medium stringency washes (i.e., 2xSSC, 0.1% SDS for 15
min at 45° C, followed by 2xSSC, 0.1% SDS for 15 min at 50° C), followed by two
sequential higher stringency washes (i.e., 0.2xSSC, 0.1% SDS for 12 min at 55° C,
followed by 0.2xSSC, 0.1% SDS for 12 min at 65-68° C).

Methods for detecting a labelled polynucleotide hybridised to an immobilised 
polynucleotide are well known to practitioners in the art. Such methods include 
autoradiography, phosphorimaging, and chemiluminescent, fluorescent and colorimetric
detection. 

4. Expression vectors 

The present invention further provides expression vectors designed for genetic 
transformation of cells, preferably prokaryotic cells, comprising a polynucleotide, fragment 
or variant according to the invention operably linked to a regulatory polynucleotide. An
expression vector is typically a nucleic acid that can be introduced into a host cell or cell- 
free transcription and translation system. An expression vector can be maintained 
permanently or transiently in a cell, whether as part of the chromosomal or other DNA in 
the cell or in any cellular compartment, such as a replicating vector in the cytoplasm. 

The various components of an expression vector can vary widely, depending on
the intended use of the vector and especially the host cell(s) in which the vector is intended
to replicate or drive expression. For example, the regulatory polynucleotide, which is used
to control expression of a polynucleotide of the invention, will generally be appropriate for
the host cell used for expression. Numerous types of appropriate expression vectors and
suitable regulatory sequences are known in the art for a variety of host cells. Typically, the
regulatory polynucleotide includes, but is not limited to, promoter sequences, leader or
signal sequences, ribosomal binding sites, transcriptional start and stop sequences,
translational start and termination sequences, and enhancer or activator sequences.
Constitutive or inducible promoters as known in the art are contemplated by the invention.
The promoters may be either naturally occurring promoters, or hybrid promoters that
combine elements of more than one promoter.

In a preferred embodiment, the expression vector is operable in a Gram-negative
prokaryotic cell. A variety of prokaryotic expression vectors may be used as a basis
for constructing the expression vector of the invention. These include but are not limited to
a chromosomal vector (e.g., a bacteriophage such as bacteriophage λ) and an
extrachromosomal vector (e.g., a plasmid or a cosmid expression vector). The expression
vector will also typically contain an origin of replication, which allows autonomous
replication of the vector, and one or more selectable marker genes that allow phenotypic
selection of the transformed cells.

The expression vector may also include a fusion partner (typically provided by the
expression vector) so that a recombinant polypeptide is expressed as a fusion polypeptide
with said fusion partner. The main advantage of fusion partners is that they assist
identification and/or purification of said fusion polypeptide. In order to express said fusion
polypeptide, it is necessary to ligate a polynucleotide according to the invention into the
expression vector so that the translational reading frames of the fusion partner and the
polynucleotide coincide. Well known examples of fusion partners include, but are not
limited to, glutathione-S-transferase (GST), Fc portion of human IgG, maltose binding
protein (MBP) and hexahistidine (HIS6), which are particularly useful for isolation of the
fusion polypeptide by affinity chromatography. For the purposes of fusion polypeptide
purification by affinity chromatography, relevant matrices for affinity chromatography are
glutathione-, amylose-, and nickel- or cobalt-conjugated resins respectively. Many such
matrices are available in "kit" form, such as the QIAexpress™ system (Qiagen) useful with
(HIS6) fusion partners and the Pharmacia GST purification system. In a preferred
embodiment, the recombinant polynucleotide is expressed in the commercial vector
pFLAG as described more fully hereinafter. Another fusion partner well known in the art is
green fluorescent protein (GFP). This fusion partner serves as a fluorescent "tag" which
allows the fusion polypeptide of the invention to be identified by fluorescence microscopy
or by flow cytometry. The GFP tag is useful when assessing subcellular localisation of the
fusion polypeptide of the invention, or for isolating cells which express the fusion
polypeptide of the invention. Flow cytometric methods such as fluorescence activated cell
sorting (FACS) are particularly useful in this latter application. Preferably, the fusion
partners also have protease cleavage sites, such as for Factor Xa or Thrombin, which allow
the relevant protease to partially digest the fusion polypeptide of the invention and thereby
liberate the recombinant polypeptide of the invention therefrom. The liberated polypeptide
can then be isolated from the fusion partner by subsequent chromatographic separation.
Fusion partners according to the invention also include within their scope "epitope tags",
which are usually short peptide sequences for which a specific antibody is available. Well
known examples of epitope tags for which specific monoclonal antibodies are readily
available include c-Myc, influenza virus haemagglutinin and FLAG tags.
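
The requirement that the translational reading frames of the fusion partner and the inserted polynucleotide coincide can be illustrated with a minimal check. The sketch assumes that the fusion partner coding sequence begins at its start codon and that any linker lies between the partner and the insert; the function names and example sequences are hypothetical and do not describe any particular vector.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def junction_in_frame(partner_cds, linker=""):
    """True if the insert begins on a codon boundary of the fusion partner's frame."""
    return (len(partner_cds) + len(linker)) % 3 == 0


def has_internal_stop(upstream):
    """True if any in-frame codon of the upstream region is a stop codon."""
    upstream = upstream.upper()
    return any(
        upstream[i:i + 3] in STOP_CODONS
        for i in range(0, len(upstream) - 2, 3)
    )


def fusion_is_readable(partner_cds, linker=""):
    """True if translation can read through partner and linker into the insert in frame."""
    upstream = partner_cds + linker
    return junction_in_frame(partner_cds, linker) and not has_internal_stop(upstream)
```

A linker that is one nucleotide short shifts the frame of the insert, and a stop codon remaining at the end of the fusion partner terminates translation before the insert is reached; either case is reported as not readable.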

Preferred host cells for purposes of selecting vector components for expression
vectors of the present invention include fungal host cells such as yeast and prokaryotic host
cells such as E. coli and X. albilineans, but mammalian cell cultures can also be used. In
hosts such as yeasts, plants, or mammalian cells that ordinarily do not produce modular
polyketide synthase enzymes, it may be necessary to provide, also typically by
recombinant means, suitable holo-ACP synthases to convert the recombinantly produced
PKS to functionality.

The expression vector may be used to transform the desired host cell to produce a
recombinant host cell for producing inter alia a recombinant polypeptide or polyketides,
particularly albicidins or analogues thereof, as described hereinafter. 

5. Methods of preparing the polypeptides of the invention 

Polypeptides of the invention, including the full-length parent polypeptides
described in Section 2.1, or their biologically active fragments comprising, for example,
one or more domains (or fragments of such domains), or variants or derivatives of these,
may be prepared by any suitable procedure known to those of skill in the art. For example,
the polypeptides may be prepared by a procedure including the steps of:

(a) preparing a recombinant polynucleotide comprising a nucleotide sequence
encoding a polypeptide comprising the sequence set forth in any one of SEQ ID NO: 4
or a fragment thereof comprising at least one sequence selected from the group
consisting of SEQ ID NO: 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36,
38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 83,
87, 89, 91, 93, 95, 99, 101, 103, 105 and 107, or variant or derivative of these, which
nucleotide sequence is operably linked to a regulatory polynucleotide;

(b) introducing the recombinant polynucleotide into a suitable host cell;

(c) culturing the host cell to express recombinant polypeptide from said 
recombinant polynucleotide; and 

(d) isolating the recombinant polypeptide. 

Suitably, said nucleotide sequence comprises at least one sequence selected from
the group consisting of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31,
33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79,
82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102 and 104.

The recombinant polynucleotide is preferably in the form of an expression vector, 
which includes a self-replicating extra-chromosomal vector such as a plasmid, or a vector 
that integrates into a host genome, as for example described above in Section 4. The step of
introducing the recombinant polynucleotide into the host cell may be effected by any 
suitable means including transfection, and transformation, the choice of which will be 
dependent on the host cell employed. Such methods are well known to those of skill in the 
art. 

Recombinant polypeptides of the invention may be produced by culturing a host
cell transformed with an expression vector containing nucleic acid encoding a polypeptide,
biologically active fragment, variant or derivative according to the invention. The
conditions appropriate for protein expression will vary with the choice of expression vector
and the host cell. This is easily ascertained by one skilled in the art through routine
experimentation.

Suitable host cells for expression may be prokaryotic or eukaryotic. One preferred 
host cell for expression of a polypeptide according to the invention is a bacterium. The 
bacterium used may be Escherichia coli. Alternatively, the host cell may be an insect cell
such as, for example, SF9 cells that may be utilised with a baculovirus expression system. 

The recombinant protein may be conveniently prepared by a person skilled in the
art using standard protocols as for example described in Sambrook, et al., MOLECULAR
CLONING. A LABORATORY MANUAL (Cold Spring Harbor Press, 1989), in particular
Sections 16 and 17; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR
BIOLOGY (John Wiley & Sons, Inc. 1994-1998), in particular Chapters 10 and 16; and
Coligan et al., CURRENT PROTOCOLS IN PROTEIN SCIENCE (John Wiley & Sons,
Inc. 1995-1997), in particular Chapters 1, 5 and 6.

Alternatively, the polypeptide, fragments, variants or derivatives of the invention 
may be synthesised using solution synthesis or solid phase synthesis as described, for 
example, in Chapter 9 of Atherton and Shephard (supra) and in Roberge et al. (1995,
Science 269: 202). 

6. Antigen-binding molecules

The invention also contemplates antigen-binding molecules that bind specifically
to the aforementioned polypeptides, fragments, variants and derivatives. Preferably, an
antigen-binding molecule according to the invention is immuno-interactive with any one or
more of the amino acid sequences set forth in SEQ ID NO: 4, 6, 8, 10, 12, 14, 16, 18, 20,
22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68,
70, 72, 74, 76, 78, 80, 83, 87, 89, 91, 93, 95, 99, 101, 103, 105 and 107, or variants thereof.

For example, the antigen-binding molecules may comprise whole polyclonal
antibodies. Such antibodies may be prepared, for example, by injecting a polypeptide,
fragment, variant or derivative of the invention into a production species, which may
include mice or rabbits, to obtain polyclonal antisera. Methods of producing polyclonal
antibodies are well known to those skilled in the art. Exemplary protocols which may be
used are described for example in Coligan et al., CURRENT PROTOCOLS IN
IMMUNOLOGY, (John Wiley & Sons, Inc, 1991), and Ausubel et al., (1994-1998, supra),
in particular Section III of Chapter 11.

In lieu of the polyclonal antisera obtained in the production species, monoclonal
antibodies may be produced using the standard method as described, for example, by
Kohler and Milstein (1975, Nature 256, 495-497), or by more recent modifications thereof
as described, for example, in Coligan et al., (1991, supra) by immortalising spleen or other
antibody producing cells derived from a production species which has been inoculated with
one or more of the polypeptides, fragments, variants or derivatives of the invention.

The invention also contemplates as antigen-binding molecules Fv, Fab, Fab' and
F(ab')2 immunoglobulin fragments. Alternatively, the antigen-binding molecule may be in
the form of a synthetic stabilised Fv (scFv) fragment, a disulphide stabilised Fv (dsFv)
fragment, a diabody (dAb), a minibody and the like, or may comprise non-immunoglobulin
derived protein frameworks. The antigen-binding molecules of the invention may be used
for affinity chromatography in isolating a natural or recombinant polypeptide or
biologically active fragment of the invention. For example, reference may be made to
immunoaffinity chromatographic procedures described in Chapter 9.5 of Coligan et al.,
(1995-1997, supra). The antigen-binding molecules can be used to screen expression
libraries for variant polypeptides of the invention as described herein. They can also be
used to detect polypeptides, fragments, variants and derivatives of the invention as
described hereinafter.

7. Identification of modulators

The invention also contemplates a method of screening for an agent that
modulates the expression of a gene selected from xabB, xabA, or xabC, or a gene
belonging to the same regulatory or biosynthetic pathway as xabB, xabA, or xabC, or a
variant of that gene, or that modulates the level and/or functional activity of an expression
product of that gene or its variant. The method comprises contacting a preparation
comprising said expression product (e.g., polypeptide or transcript), or a biologically active
fragment thereof, or variant or derivative of these, or a genetic sequence that modulates the
expression of said gene (e.g., the natural promoter relating to said gene, e.g., the xabB
promoter, comprising the sequence set forth in SEQ ID NO: 81 or complement thereof),
with a test agent, and detecting a change in the level and/or functional activity of said
polypeptide or biologically active fragment thereof, or variant or derivative, or of a product
expressed from said genetic sequence.

Modulators contemplated by the present invention include agonists and
antagonists of gene expression. Antagonists of gene expression include antisense
molecules, ribozymes and co-suppression molecules, as for example described in Section 2.
Agonists include molecules which increase promoter activity or interfere with negative
mechanisms. Agonists of a gene include molecules which overcome any negative
regulatory mechanism. Antagonists of polypeptides encoded by a gene of interest include
antibodies and inhibitor peptide fragments.

Candidate agents encompass numerous chemical classes, though typically they are
organic molecules, preferably small organic compounds having a molecular weight of
more than 50 and less than about 2,500 Dalton. Candidate agents comprise functional
groups necessary for structural interaction with proteins, particularly hydrogen bonding,
and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at
least two of the functional chemical groups. The candidate agents often comprise cyclical
carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted
with one or more of the above functional groups. Candidate agents are also found among
biomolecules including, but not limited to: peptides, saccharides, fatty acids, steroids,
purines, pyrimidines, derivatives, structural analogues or combinations thereof.
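
The preferred molecular weight range and functional group criteria described above could be applied as a simple pre-filter over a compound collection. The sketch below is illustrative only; the function name, the representation of a compound by its molecular weight and a set of functional group labels, and the `min_groups` parameter are assumptions.

```python
FUNCTIONAL_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


def is_preferred_candidate(molecular_weight_da, groups, min_groups=2):
    """True for a compound of more than 50 and less than 2,500 Dalton that
    carries at least `min_groups` of the listed functional chemical groups."""
    in_range = 50.0 < molecular_weight_da < 2500.0
    n_groups = len(FUNCTIONAL_GROUPS & set(groups))
    return in_range and n_groups >= min_groups
```

Setting `min_groups=1` corresponds to the broader criterion of at least one such group, while the default of two reflects the stated preference.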

Small (non-peptide) molecule modulators of a polypeptide according to the
invention, or portion, or domain or module thereof are particularly preferred. In this regard,
small organic molecules typically have the ability to gain entry into an appropriate cell and
affect the expression of a gene (e.g., by interacting with the regulatory region or
transcription factors involved in gene expression); or affect the activity of a gene by
inhibiting or enhancing the binding of accessory molecules.

Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant
and animal extracts are available or readily produced. Additionally, natural or synthetically
produced libraries and compounds are readily modified through conventional chemical,
physical and biochemical means, and may be used to produce combinatorial libraries.
Known pharmacological agents may be subjected to directed or random chemical
modifications, such as acylation, alkylation, esterification, amidification, etc. to produce
structural analogues. Screening may also be directed to known pharmacologically active
compounds and chemical analogues thereof.

Screening for modulatory agents according to the invention can be achieved by 
any suitable method. For example, the method may include contacting a cell comprising a 
polynucleotide corresponding to a gene as defined above, with an agent suspected of
having said modulatory activity and screening for the modulation of the level and/or
functional activity of a protein encoded by said polynucleotide, or the modulation of the
level of an expression product encoded by the polynucleotide, or the modulation of the
activity or expression of a downstream cellular target of said protein or said expression
product. Detecting such modulation can be achieved utilising techniques including, but not
restricted to, ELISA, cell-based ELISA, filter-binding ELISA, inhibition ELISA, Western
blots, immunoprecipitation, slot or dot blot assays, immunostaining, RIA, scintillation
proximity assays, fluorescent immunoassays using antigen-binding molecule conjugates or
antigen conjugates of fluorescent substances such as fluorescein or rhodamine,
Ouchterlony double diffusion analysis, immunoassays employing an avidin-biotin or a
streptavidin-biotin detection system, and nucleic acid detection assays including reverse
transcriptase polymerase chain reaction (RT-PCR). 

It will be understood that a polynucleotide from which a target molecule of 
interest is regulated or expressed may be naturally occurring in the cell which is the subject 
of testing or it may have been introduced into the host cell for the purpose of testing. 
Further, the naturally-occurring or introduced sequence may be constitutively expressed -
thereby providing a model useful in screening for agents which down-regulate expression
of an encoded product of the sequence wherein said down regulation can be at the nucleic
acid or expression product level - or may require activation - thereby providing a model
useful in screening for agents that up-regulate expression of an encoded product of the
sequence. Further, to the extent that a polynucleotide is introduced into a cell, that
polynucleotide may comprise the entire coding sequence which codes for a target
polypeptide or it may comprise a portion of that coding sequence (e.g. a domain or module
as herein described) or a portion that regulates expression of a product encoded by the
polynucleotide (e.g., a promoter). For example, the promoter that is naturally associated
with the polynucleotide (i.e. the xabB promoter) may be introduced into the cell that is the
subject of testing. In this regard, where only the promoter is utilised, detecting modulation
of the promoter activity can be achieved, for example, by operably linking the promoter to
a suitable reporter polynucleotide including, but not restricted to, green fluorescent protein
(GFP), luciferase, β-galactosidase and chloramphenicol acetyl transferase (CAT). Modulation
of expression may be determined by measuring the activity associated with the reporter
polynucleotide.

In another example, the subject of detection could be a downstream regulatory or
biosynthetic target of the target molecule, rather than the target molecule itself or the reporter
molecule operably linked to a promoter of a gene encoding a product the expression of 
which is regulated by the target protein. 

These methods provide a mechanism for performing high throughput screening of
putative modulatory agents such as proteinaceous or non-proteinaceous agents comprising
synthetic, combinatorial, chemical and natural libraries. These methods will also facilitate
the detection of agents which bind either the polynucleotide encoding the target molecule
or which modulate the expression of an upstream molecule, which subsequently modulates
the expression of the polynucleotide encoding the target molecule. Accordingly, these
methods provide a mechanism of detecting agents that either directly or indirectly 
modulate the expression and/or activity of a gene or expression product according to the 
invention. 

8. Production of secondary metabolites

The present invention further relates to a process for enhancing the level and/or
functional activity of secondary metabolites, preferably albicidins, using one or more
agents selected from the polynucleotides, polypeptides, fragments, variants, derivatives,
vectors and modulatory agents described above. The process, in a preferred embodiment,
includes the steps of stably transforming a host cell with an expression vector as broadly
described above, comprising at least one nucleic acid sequence encoding a polypeptide of
the invention or a biologically active fragment or variant or derivative of these and
isolating transformants which produce an enhanced amount of antibiotics, which are
preferably of the albicidin class. The vector optionally comprises a signal sequence for
secretion recognised by the host cell. Illustrative secretory leaders include the secretory
leaders of penicillinase, α-factor, immunoglobulin, T-cell receptors, outer membrane
proteins, glucoamylase, fungal amylase and the like. By fusion in proper reading frame, the
mature polypeptide may be secreted into the medium. The host cell may be a eukaryote or
a prokaryote cell. In one embodiment, the cell naturally produces polyketides, preferably
antibiotic polyketides and, in this regard, the cell is preferably X. albilineans or other
bacteria capable of producing albicidins. Optionally, the construct may include a
transcription regulating sequence, which is not subject to repression by substances present
in the growth medium. The above process may be used to prepare antibiotics directly or
it may be used to prepare cell free extracts containing increased quantities of antibiotics,
preferably of the albicidin class, for in vitro preparation of said antibiotics. Suitably, these
cell free extracts may be prepared for example using the method disclosed by Dobrogosz,
W.J. (1981) Enzymatic activity. In Manual of Methods for General Bacteriology
(Gerhardt, P., ed) Washington, DC: American Society for Microbiology, pp. 365-392. In a
preferred embodiment, a vector from which a phosphopantetheinyl transferase (PPTase)
can be translated is also introduced into the host cell. Expression of PPTase
polynucleotides has been shown to be important for the production of polyketides in
heterologous expression systems. Preferably, the PPTase is selected from EntD and/or
XabA as for example disclosed herein. If desired, a vector from which a methyltransferase,
more preferably an O-methyltransferase, and even more preferably an
S-adenosylmethionine O-methyltransferase, can be translated may also be introduced into the
host cell. An exemplary methyltransferase for this purpose is XabC as described herein. 

Alternatively, the expression hosts may be used as a source of increased quantities
of antibiotics, which can be subsequently purified as for example disclosed by Birch et al.
in U.S. Patent No. 4,525,354. 

The invention also contemplates use of the polynucleotides, polypeptides,
fragments, variants and derivatives of the invention in methods of combinatorial
biosynthesis of novel antibiotics as for example disclosed by Khosla et al. in U.S. Patent
No. 5,712,146, Peterson et al. in U.S. Patent No. 5,783,431 and Betlach et al. in U.S.
Patent No. 6,251,636 or in methods of producing antibiotics in hosts that ordinarily do not
produce them as for example disclosed by Barr et al. in U.S. Patent No. 6,033,883. As
discussed in Section 2.4, the invention contemplates albicidin PKS-NRPS derivatives with
altered activities in one or more respects for the production of polyketides other than the
albicidin natural product(s) of XabB. In this regard, expression vectors containing
nucleotide sequences encoding a variety of such derivatives for the production of different 
polyketides are transformed into the appropriate host cells to construct a library. In one 
embodiment, a mixture of such vectors is transformed into selected host cells and the
resulting cells plated into individual colonies and selected to identify successful
transformants. A variety of strategies is available to obtain a multiplicity of colonies each
containing a PKS gene cluster derived from the naturally occurring host gene cluster so
that each colony in the library produces a different PKS and ultimately a different
polyketide, as for example disclosed by Betlach et al. in U.S. Patent No. 6,251,636. The
libraries thus produced can be considered at four levels: (1) a multiplicity of colonies each
with a different PKS-NRPS encoding sequence; (2) the proteins produced from the coding
sequences; (3) the polyketides produced from the proteins assembled into a functional
PKS-NRPS; and (4) antibiotics or compounds with other desired activities derived from 
the polyketides. Colonies in the library can be induced to produce the relevant synthases 
and thus to produce the relevant polyketides to obtain a library of polyketides. Polyketides 
that are secreted into the media or have been otherwise isolated can be screened for
binding to desired targets, such as receptors, signalling proteins, and the like. The
supernatants per se can be used for screening, or partial or complete purification of the
polyketides can first be effected. Typically, such screening methods involve detecting the
binding of each member of the library to a receptor or other target ligand. Binding can be
detected either directly or through a competition assay. Means to screen such libraries for
binding are well known in the art. Alternatively, individual polyketide members of the
library can be tested against a desired target. In this event, screens wherein the biological 
response of the target is measured can more readily be included. Antibiotic activity can be 
verified using typical screening assays such as those for albicidin set forth in Example 1. 

The invention also extends to the use of the polynucleotides, polypeptides, 
fragments, variants and derivatives of the invention for the synthesis of antibiotics,
preferably antibiotics of the albicidin class. 

The polynucleotides of the invention encoding XabB, or a biologically-active 
fragment or variant thereof, together with a recombinant polynucleotide encoding a PPTase
and/or an O-methyltransferase which participate or which are capable of participating in
the albicidin biosynthetic pathway, provide the means to engineer high level co-expression
of the albicidin synthetase, its activating PPTase and modifying methyltransferase to obtain 
higher yields of albicidins. 

In order that the invention may be readily understood and put into practical effect, 
particular preferred embodiments will now be described by way of the following
non-limiting examples.

EXAMPLES
EXAMPLE 1 

Albicidin multifunctional synthase gene
Materials and Methods

Bacterial strains and plasmids

The properties of bacteria and plasmids used in this example are listed in Table 1. 

Media, culture conditions and antibiotics 

X. albilineans strains were routinely cultured on SP medium (Birch & Patil,
1985b) at 28° C. Escherichia coli DH5α and JM109 were used as hosts in cloning
experiments and were grown on LB medium at 37° C (Sambrook et al., 1989). Broth
cultures were aerated by shaking at 200 r.p.m. on an orbital shaker. Modified YEB medium
(Van Larebeke et al., 1977) for patch mating consisted of 10 mg mL⁻¹ peptone, 5 mg mL⁻¹
yeast extract, 5 mg mL⁻¹ NaCl, 5 mg mL⁻¹ sucrose and 0.5 mg mL⁻¹ MgSO4.7H2O. The
following antibiotics were added to media as required: 50 µg kanamycin mL⁻¹; 15 µg
tetracycline mL⁻¹; 100 µg ampicillin mL⁻¹.

Routine genetic procedures 

Bacterial genomic DNA and plasmid DNA isolation, gel electrophoresis, DNA 
restriction digests, ligation reactions and transformation were performed by routine 
procedures (Sambrook et al., 1989). DNA fragments were excised from agarose gels and
residual agarose was removed with the BRESAclean™ DNA purification kit (GeneWorks,
Adelaide). 

Construction of a X. albilineans partial genomic library 

Genomic DNA from X. albilineans Xa13 was digested with EcoRI and size-fractionated.
DNA fragments of 15 to 20 kb were ligated to dephosphorylated EcoRI-cleaved
pBluescript II SK. The ligated DNA was electroporated into E. coli TOP10.
Transformants were selected on LB agar medium containing ampicillin, and stored in LB
broth with 15% glycerol at -70°C.

PCR amplification

BamHI-digested genomic DNA from X. albilineans LS157 was religated at low
concentration (0.5 µg/mL) to generate circular DNA molecules as templates for inverse
PCR. Three primers, one from the IS terminal region of Tn5 (B2: 5'-
CGGGATCCTCACATGGAAGTCAGATCCTG-3'), and two flanking the unique BamHI
restriction site of Tn5 (BL: 5'-GGGGACCTTGCACAGATAGC-3', and BR: 5'-
CATTCCTGTAGCGGATGGAGATC-3'), were used to amplify the sequences flanking
the Tn5 insertion in the genome of LS157. The amplified fragments (1.4-kb and 6.0-kb)
were cloned into pZErO-2, yielding pZIL and pZIR (Figure 1).

PCR was performed in a volume of 50 µL with 200 ng of genomic DNA (or 10 ng
of plasmid DNA), 0.4 ng/µL of each primer, 0.2 mM of each dNTP, 1.8 mM Mg²⁺, and
1 unit of elongase enzyme mix (Life Technologies). A 10-min initial denaturation step at
94° C was followed by 35 thermal cycles of denaturation at 94° C for 1 min, annealing at
55° C for 1 min, and extension at 72° C for 1 min per 1 kb of expected amplification.
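
The cycling conditions above scale the extension step with product length, so the programmed hold time differs between the two inverse-PCR products. The following Python sketch is illustrative only and forms no part of the described procedure: the class and method names are hypothetical, and ramp times between temperatures are assumed to be zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    """Hold times (minutes) for the cycling conditions quoted in the text."""
    initial_denaturation_min: float = 10.0
    cycles: int = 35
    denaturation_min: float = 1.0
    annealing_min: float = 1.0
    extension_min_per_kb: float = 1.0

    def extension_min(self, product_kb: float) -> float:
        # Extension is scaled at 1 min per kb of expected product.
        return self.extension_min_per_kb * product_kb

    def total_hold_min(self, product_kb: float) -> float:
        # Sum of programmed holds only; instrument ramping is ignored (assumption).
        per_cycle = (self.denaturation_min + self.annealing_min
                     + self.extension_min(product_kb))
        return self.initial_denaturation_min + self.cycles * per_cycle


program = CyclingProgram()
for kb in (1.4, 6.0):  # sizes of the two amplified flanking fragments
    print(f"{kb} kb product: {program.total_hold_min(kb):.0f} min of programmed holds")
```

On these assumptions the 6.0-kb product requires more than twice the programmed time of the 1.4-kb product, which is one reason for amplifying the two flanks in separate reactions when instrument time is limiting.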

Construction of promoter fusions and glucuronidase assay

Plasmid pRG960sd contains a promoterless β-glucuronidase gene (uidA)
downstream of a multiple cloning site (Van den Eede et al., 1992). Sequence upstream of
xabB (nucleotide residues 1005 to 1210 or 521 to 1210) was amplified from pLXABB by
PCR. Forward primer P1F1 (5'-ACGCGGATCCCAGCAGGGTGTCATACACG-3'), or
P1F2 (5'-TCGGGATCCGCGCGATTGAAGTAGTCC-3') contained a BamHI
restriction site (underlined). Reverse primer P1R (5'-
TCCCCCGGGCGGCCAGCGTGGTGCTACTAC-3') introduced an XmaI restriction site
(underlined). PCR fragments were ligated into BamHI/XmaI-cut pRG960sd, yielding
pRG960p1 and pRG960p2. These constructs were mobilised from E. coli DH5α into X.
albilineans LS155 as described below.



Promoter strength was quantified by fluorometric analysis of glucuronidase
activity (Jefferson, 1987; Xiao et al., 1992). The protein content in cell lysates was
determined by the dye-binding method (Bradford, 1976) using a Bio-Rad protein assay kit. 

Bacterial conjugation 

DNA transfer between E. coli donor (JM109 pLAFR3 ± insert, or DH5α
pRG960sd ± insert) and X. albilineans recipient (LS157 or LS155) was accomplished by
triparental transconjugation with helper strain pRK2013. Mid-log-phase cultures of the
recipient were spotted onto agar plates containing YEB medium with no antibiotics (20 µL
per spot). After the liquid was absorbed by the agar, 20 µL of mid-log-phase culture of the
helper was added to each spot. The liquid was again allowed to absorb, and 20 µL of
mid-log-phase culture of the donor was added to each spot. After incubation of the mating
plates overnight at 28° C, transconjugants were selected on SP plates supplemented with 
ampicillin, and tetracycline or spectinomycin. 

Assay and quantification of albicidin production

Albicidin was quantified by a microbial plate bioassay as described previously
(Birch and Patil, 1985b), except that the 10 mL basal layer of LB agar and the 5 mL
overlayer of 50% LB with 1% agar were supplemented with tetracycline or spectinomycin,
and E. coli DH5α pLAFR3 or pRG960sd was used as the indicator strain. This change
avoided interference by tetracycline or spectinomycin, which were added to some cultures
to ensure retention of pLAFR3 or pRG960sd derivatives in X. albilineans. Inhibition zone
widths in the bioassay were converted to albicidin concentrations by interpolation on a
dose-response plot produced under the same assay conditions. The plot fits the formula:
Log [Alb] = 0.3 W - 0.92, where [Alb] is units of albicidin per 20 µL sample assayed, and
W is the width in millimetres of the zone of growth inhibition surrounding each well. 
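
By way of illustration only, the dose-response relationship above can be applied as in the following Python sketch. The function names are hypothetical, and the logarithm is assumed to be base 10, which the text does not state explicitly.

```python
import math

# Coefficients of the fitted line quoted in the text: log[Alb] = SLOPE * W + INTERCEPT.
# Treating "Log" as log10 is an assumption.
SLOPE = 0.3
INTERCEPT = -0.92


def albicidin_units(zone_width_mm: float) -> float:
    """Convert an inhibition zone width (mm) to albicidin units per assayed sample."""
    return 10 ** (SLOPE * zone_width_mm + INTERCEPT)


def zone_width_mm(units: float) -> float:
    """Inverse conversion: expected zone width (mm) for a given number of units."""
    if units <= 0:
        raise ValueError("units must be positive")
    return (math.log10(units) - INTERCEPT) / SLOPE


for width in (4.0, 6.0, 8.0):  # illustrative zone widths, not measured data
    print(f"W = {width:.1f} mm -> {albicidin_units(width):.1f} units")
```

Under this reading, each additional millimetre of inhibition zone corresponds to roughly a twofold increase in albicidin (a factor of 10^0.3), so small errors in measuring W translate into substantial errors in the estimated concentration.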

Results

Cloning and sequencing of xabB gene required for albicidin production

Xanthomonas albilineans Tox⁻ mutant LS157 contains a single Tn5 insertion, in a
4.1 kb ClaI restriction fragment or a 16.5 kb EcoRI restriction fragment (Figure 1).
Selection for kanamycin resistance, following shotgun cloning of ClaI restriction
fragments of LS157 DNA into pBluescript II SK, yielded clone pBC157. Sequences
flanking the Tn5 insertion in LS157 DNA were amplified by inverse PCR, and cloned into
pZErO-2, producing pZIL and pZIR. Plasmid pLXABB was screened from a X.
albilineans Xa13 EcoRI genomic library with probes described in Figure 1B. Subclones
pSEBL and pSEBR were derived from pLXABB (Figure 1C, Table 1).

The double-strand sequence of the 16,511 bp EcoRI genomic fragment in
pLXABB was obtained by a primer-walking approach, using subclones pBC157, pZIL,
pZIR, pSEBL, and pSEBR. The Tn5 insertion in the genome of LS157 is accompanied by
a 9-bp perfect repeat sequence (GTCCTGAAG), commencing at 2490 bp in GenBank
accession no. AF239749. 



The only ORF longer than 900 bp within the 16.5-kb fragment is disrupted by the 
Tn5 insertion. This ORF (designated xabB) encodes a protein of 4081 aa (Mr 525,695). It 
commences at 1230 bp in GenBank accession no. AF239749 with a TTG codon, 6 bp
downstream from a ribosome binding sequence (RBS) GAGG, which may impose
post-transcriptional control on the rate of gene product formation (McCarthy and Gualerzi,
1990). There is an alternative start codon (ATG) a further 15 bp downstream. Of the
codons in this ORF, 8% are rarely used in E. coli. The closest match (TTGAGC-14x-
TATAAC) to the consensus -35 (TTGACA) and -10 (TATAAT) sequences for E. coli σ70
promoters occurs 117 bp upstream of the translation initiation codon (Figure 2).
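
A search for promoter-like sequences of the kind described above can be sketched as follows. This Python example is illustrative and is not the analysis used to obtain the reported match: the function names, the scoring by simple identity count, and the permitted spacer range of 14 to 19 bp are all assumptions, and the test sequence is synthetic.

```python
MINUS_35 = "TTGACA"  # E. coli consensus -35 hexamer, as quoted in the text
MINUS_10 = "TATAAT"  # E. coli consensus -10 hexamer, as quoted in the text


def hexamer_matches(site: str, consensus: str) -> int:
    """Number of positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(site, consensus))


def promoter_like_sites(seq: str, spacers=range(14, 20), top: int = 3):
    """Rank (-35 box, spacer, -10 box) combinations by total identities (0 to 12)."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 12 - min(spacers) + 1):
        box35 = seq[i:i + 6]
        for spacer in spacers:
            j = i + 6 + spacer
            box10 = seq[j:j + 6]
            if len(box10) < 6:
                continue  # this spacer would run past the end of the sequence
            score = hexamer_matches(box35, MINUS_35) + hexamer_matches(box10, MINUS_10)
            hits.append((score, i, spacer, box35, box10))
    # Best score first; ties resolved by position, then by spacer length.
    hits.sort(key=lambda h: (-h[0], h[1], h[2]))
    return hits[:top]


# Synthetic sequence containing a perfect consensus pair with a 17-bp spacer.
demo = "GGGG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGG"
print(promoter_like_sites(demo, top=1))
print(hexamer_matches("TTGAGC", MINUS_35))  # the -35 element reported for xabB
```

Scored in this way, the -35 element reported for xabB (TTGAGC) agrees with the consensus at four of six positions.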

Downstream by 35 bp from the TAG stop codon of xabB is a probable RBS 
(GAGG), separated by 6 bp from the ATG start codon of another ORF (designated xabC)
in the same orientation as xabB. Overlapping the xabB promoter region is another probable
promoter for a divergent transcript including a putative RBS (TGGAGG) and start codon
for a gene designated xatA, separated by 233 bp from xabB (Figures 1 and 2).

Complementation of xabB gene in LS157

Mobilisation of pLAFR3, pLXABB1 or pLXABB2 by bacterial conjugation into
Tox⁻ mutant LS157 occurred at a frequency of 1.5 × 10⁻² transconjugants/recipient cells.
Albicidin production was undetectable in Tox⁻ mutant LS157 and LS157 (pLAFR3)
controls, but introduction of the xabB gene on pLXABB1 or pLXABB2 restored albicidin
production to the level of the wild-type parental strain LS155 (Figure 4).

Functional analysis of xabB promoter region

GUS activity was undetectable in LS155 and LS155 (pRG960sd) controls.
Plasmid pRG960p1 or pRG960p2, with 206 bp or 690 bp from the xabB promoter region
upstream of GUS, both conferred GUS activity with no difference in expression level or
pattern in X. albilineans LS155 (Figure 5). 

Discussion

Albicidin was partially characterised as a low-molecular-weight compound that
contains 38 carbon atoms with 3-4 aromatic rings (Birch and Patil, 1985a). The compound
is not degraded by peptidases (Birch and Patil, 1985a), but it is cleaved by the AlbD
esterase (Zhang and Birch, 1997). Based on the deduced functionality of the synthase
described herein, albicidin is likely to be a complex polyketide, condensed with amino
acid(s), or nonproteinogenic amino, hydroxyl and carboxyl acid(s) by C-N, amide or ester
bond formation.

The characterisation of XabB as a multi-modular hybrid enzyme provides new
insights into the mechanism of albicidin biosynthesis and possible approaches to engineer
the overproduction of albicidins. For example, the complementation experiments (Figure
4) indicate that increased copy number of xabB stimulates early production of albicidin,
but other factors become limiting during idiophase. It may be possible to increase
expression of the albicidin synthase by modifications to the promoter and TTG start codon,
or to improve albicidin yields by supplying candidate substrates (such as shikimate-derived
units). The unusual enzyme organisation also contributes to the emerging understanding of
how microbes generate structural diversity of antibiotics, and can facilitate combinatorial
engineering of antibiotics of mixed peptide/polyketide origin.

EXAMPLE 2

Albicidin Antibiotic and Phytotoxin Biosynthesis in Xanthomonas albilineans Requires a
Phosphopantetheinyl Transferase Gene

Materials and Methods

Bacterial strains and plasmids

The properties of bacteria and plasmids used in this Example are listed in Table 3. 

Media, culture conditions and antibiotics 

X. albilineans strains were routinely cultured on SP medium (Birch & Patil,
1985b) at 28° C. Escherichia coli DH5α and JM109 were used as hosts in cloning
experiments and were grown on LB medium at 37° C (Sambrook et al., 1989). Broth
cultures were aerated by shaking at 200 r.p.m. on an orbital shaker. Modified YEB medium
(Van Larebeke et al., 1977) for patch mating consisted of 10 mg mL⁻¹ peptone, 5 mg mL⁻¹
yeast extract, 5 mg mL⁻¹ NaCl, 5 mg mL⁻¹ sucrose and 0.5 mg mL⁻¹ MgSO4.7H2O. The
following antibiotics were added to media as required: 50 µg kanamycin mL⁻¹; 15 µg
tetracycline mL⁻¹; 100 µg ampicillin mL⁻¹.



Assay of albicidin production 

Albicidin was quantified by a microbial plate bioassay as described previously 
(Birch and Patil, 1985b), except that the 10 mL basal layer of LB agar and the 5 mL 
overlayer of 50% LB with 1% agar were supplemented with tetracycline, and E. coli 
DH5α [pLAFR3] was used as the indicator strain. This change avoided interference by
tetracycline, which was added to some cultures to ensure retention of pLAFR3 derivatives
in X. albilineans.



Routine genetic procedures 

Bacterial genomic DNA and plasmid DNA isolation, gel electrophoresis, DNA 
restriction digests, ligation reactions and transformation were performed by routine
procedures (Sambrook et al., 1989). DNA fragments were excised from agarose gels and
residual agarose was removed with the BRESAclean™ DNA purification kit (GeneWorks,
Adelaide). 

DNA sequencing and analysis 

Sequencing reactions were performed by dideoxynucleotide chain termination
(Sanger et al., 1977) using the BigDye™ Terminator Cycle Sequencing Kit and 373A
DNA sequencer (PE Applied Biosystems) through the Australian Genome Research
Facility. Oligonucleotide primers were purchased from GeneWorks (Adelaide). University
of Wisconsin Genetics Computer Group (UWGCG) programs BLASTP, FASTA, PILEUP,
and BESTFIT were used through WebANGIS version 2.0 for DNA and protein sequence
analyses of the GenBank, EMBL, PIR and SWISSPROT databases using standard defaults.

Cloning of Tn5 flanking sequences

EcoRI-digested genomic DNA from X. albilineans Tox⁻ mutant LS156 was
ligated into pBluescript II SK and electroporated into E. coli DH5α. Transformants were
selected on LB medium containing kanamycin and ampicillin, yielding clone pBEA1, from
which subclones pCEA1 and pPEA1 were obtained (Figure 1).



Amplification of sequences from wild-type LS155 by PCR

Sequences flanking the Tn5 insertion in LS156 were used to design primers (A1F:
5'-TTTGGGTTGGATCGGGTAG-3' and A1R: 5'-CTTCTCGTCCTTGCTCTTC-3')
for PCR-amplification of the corresponding wild type X. albilineans LS155 chromosomal
DNA. PCR was performed in a volume of 50 µL with 200 ng of genomic DNA, 0.4 ng
µL⁻¹ of each primer, 0.2 mM of each dNTP, 1.8 mM Mg²⁺, and 1 unit of elongase
enzyme mix (Life Technologies). A 4-min initial denaturation step at 94° C was followed
by 35 thermal cycles of denaturation at 94° C for 1 min, annealing at 55° C for 1 min, and
extension at 72° C for 2 min. The amplified DNA fragment was cloned into pGEM-T to
give pGTA1 (Figure 1).

Construction of expression vectors 

The coding region of the xabA gene was amplified from pGTA1 by PCR. Primer
A1F1 (5'-GGAATTCCATGCCCAATGCCGTACCG-3') contained an EcoRI restriction
site (underlined) for insertion of the amplified gene into the correct reading frame of lacZ
in pLAFR3. Primer A1R1 (5'-CGGGATCCCGTGCTCACCAGGCGTAGTGG-3')
introduced a BamHI restriction site (underlined), 5 bases downstream from the stop codon
of the amplified gene. The amplified DNA fragment was digested with EcoRI and BamHI,
and ligated with EcoRI/BamHI-digested pLAFR3 to result in pLXABA.

Similarly, the coding region of the entD gene was PCR-amplified from E. coli
DH5α by colony PCR using primers EntDF (5'-
TCCCGGAATTCCATGGTCGATATGAAAACTACGC-3') and EntDR (5'-
GCCCAAGCTTCTAATCGTGTTGGCACAGCGTTATG-3'), then ligated into pLAFR3
to produce pLENTD. The inserts in pLXABA and pLENTD were sequenced to confirm
the expected clones.

Bacterial triparental mating

DNA transfer between E. coli donor (JM109 pLAFR3 ± insert) and X. albilineans
recipient (LS155 or LS156) was accomplished by triparental transconjugation with helper
strain pRK2013. The mid-log-phase cultures of the recipient were spotted onto agar plates
containing YEB medium with no antibiotics (20 µL per spot). After the liquid was
absorbed by the agar, 20 µL of mid-log-phase culture of the helper was added to each spot.
The liquid was again allowed to absorb, and 20 µL of mid-log-phase culture of the donor
was added to each spot. After incubation of the mating plates overnight at 28° C,
transconjugants were selected on SP plates supplemented with tetracycline and ampicillin.

Results 

Cloning and sequencing of the xabA gene required for albicidin production

Xanthomonas albilineans Tox⁻ mutant LS156 contains a single Tn5 insertion, in a
3.0-kb EcoRI restriction fragment (Wall & Birch, 1997). Selection for Tn5-encoded
kanamycin resistance, following shotgun cloning of EcoRI restriction fragments of LS156
DNA into pBluescript II SK, yielded pBEA1 (Figure 8).

Both strands of the insert in pBEA1 excluding the Tn5 insertion were sequenced
by primer-walking from T3 and T7 vector sequences in pBEA1 and subclones pCEA1 and
pPEA1. The corresponding genomic region was amplified from wild-type X. albilineans
LS155 by PCR, and cloned into pGEM-T to give pGTA1. Sequencing of pGTA1 revealed
that a 9-bp imperfect repeat sequence (TTGGCCACG) in the genome of LS156
accompanied the Tn5 insertion (following base number 1869 in Figure 9). The
double-strand nucleotide sequence of the 2989 bp wild type EcoRI fragment is deposited in
GenBank under accession no. AF191324. 

Reading frame analysis of the 3 kb EcoRI fragment revealed that only one ORF
(designated xabA) is disrupted by the Tn5 insertion. This ORF encodes a protein of 278 aa
(Mr 29 277), with 6.12% of codons rarely used in E. coli. There were no close matches to E.
coli -10 (TATAAT) and -35 (TTGACA) consensus promoter sequences, and no
appropriately spaced RBS sequence (such as AGGA or GAGG) in the region upstream of
the putative start codon ATG (Figure 9). A region of GC-rich dyad symmetry with a free
energy of -10.2 kcal/mol was found, followed by two TCTC boxes that closely resemble
the TCTG consensus sequence characteristic of many factor-independent termination sites
(Brendel & Trifonov, 1984; Platt, 1986) downstream of the TGA termination codon of
xabA. 

Comparison of XabA with other bacterial PPTases

A search for proteins with homology to the deduced xabA product, using the
FASTA and BLASTP programs and the SWISSPROT database, indicated regions of
similarity to EntD from Escherichia coli (170 aa overlap, 35.9 % identity, 56.5 % similarity),
Shigella flexneri (180 aa overlap, 35.0 % identity, 55.6 % similarity), Salmonella typhimurium
(184 aa overlap, 35.9 % identity, 62.0 % similarity), and Salmonella austin (172 aa overlap,
36.1 % identity, 61.1 % similarity). XabA contains (V/I)G(V/I)D and
(F/W)(S/C/T)xKE(S/A)xxK domains characteristic of the phosphopantetheinyl transferase
(PPTase) superfamily, and shares 17-36 % overall identity, 39-62 % overall similarity,
with other bacterial PPTases (Table 4). 
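
The two signature motifs noted above lend themselves to a simple pattern search. The following Python sketch is illustrative only: the function name is hypothetical, the 40 to 45 residue spacing is taken from the signature discussed elsewhere in this Example and is assumed to count the residues lying between the two motifs, and the test peptide is synthetic rather than the XabA sequence.

```python
import re

# (V/I)G(V/I)D and (F/W)(S/C/T)xKE(S/A)xxK written as regular expressions,
# where "x" (any residue) becomes ".".
MOTIF_1 = re.compile(r"[VI]G[VI]D")
MOTIF_2 = re.compile(r"[FW][SCT].KE[SA]..K")


def find_pptase_signature(protein: str, min_gap: int = 40, max_gap: int = 45):
    """Return (motif-1 start, motif-2 start, gap) for motif pairs with allowed spacing.

    Positions are 0-based. Only non-overlapping occurrences of each motif are
    considered, which is adequate for a sketch but not exhaustive.
    """
    protein = protein.upper()
    hits = []
    for m1 in MOTIF_1.finditer(protein):
        for m2 in MOTIF_2.finditer(protein, m1.end()):
            gap = m2.start() - m1.end()
            if min_gap <= gap <= max_gap:
                hits.append((m1.start(), m2.start(), gap))
    return hits


# Synthetic peptide: VGID, 42 filler residues, then FSAKESAAK.
demo = "MK" + "VGID" + "A" * 42 + "FSAKESAAK" + "LL"
print(find_pptase_signature(demo))
```

A motif search of this kind only flags candidate PPTases; as the identity and similarity figures above indicate, overall sequence conservation in the superfamily is low, so motif hits would normally be confirmed by alignment.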

Enhanced expression of xabA by complementation in LS156 results in increased
production of albicidins

Mobilisation of pLAFR3 or pLXABA (pLAFR3::xabA) by triparental matings
into Tox⁻ mutant LS156 occurred at a frequency of 1.5 × 10⁻² transconjugants/recipient
cells. Albicidin production was undetectable in Tox⁻ mutant LS156 and LS156 (pLAFR3)
controls, but introduction of the xabA gene on pLXABA
restored albicidin production (Figure 10). In LS156 (pLXABA), as in LS155, albicidin was
first detectable in late-log-phase cultures (OD550 = 0.7) and was maximal in stationary
phase. Albicidin production was not responsive to IPTG or glucose, and the lac promoter
driving xabA in pLXABA is considered to express constitutively in X. albilineans. The E.
coli entD gene, expressed from the lac promoter in pLENTD, also complemented the
xabA::Tn5 mutation, restoring albicidin production in LS156.



Discussion 

A gene required for albicidin production in X. albilineans was isolated using a
Tn5 mutagenesis and shotgun cloning approach. The ORF interrupted by Tn5 in Tox⁻
mutant LS156 is designated xabA. This ORF was isolated from Tox⁺ parent strain LS155,
and shown to enhance albicidin production early in the production phase in LS156 when
expressed from the lac promoter in pLAFR3. Tn5 insertions typically cause polar
mutations affecting all downstream cistrons in an operon (De Bruijn and Lupski, 1984).
Complementation of the mutation in LS156 by the isolated xabA ORF indicates the
absence of any downstream cistron involved in albicidin production. There is no consensus
RBS sequence close to the alternative start codons for this ORF in the X. albilineans
genome. Translation may be initiated without an evident ribosome binding sequence
complementary to the 3' end of the 16S rRNA, as observed for some streptomycete genes
involved in secondary metabolism (Strohl, 1992), and for some chloroplast genes (Kozak, 
1999). 

PPTases play an essential role in priming polyketide, fatty acid, non-ribosomal
peptide and siderophore biosynthesis (Gehring et al., 1997a; Lambalot et al., 1996;
Marahiel et al., 1997; Walsh et al., 1997). All polyketide synthases, fatty acid synthetases,
and non-ribosomal peptide synthetases require post-translational modification to become
catalytically active (Walsh et al., 1997). The inactive apo-proteins are converted to their
active holo-forms by transfer of the 4'-phosphopantetheinyl (P-pant) moiety of coenzyme
A to the sidechain hydroxyl of a serine residue in a conserved carrier domain (Lambalot et
al., 1996; Walsh et al., 1997). The P-pant moiety serves to covalently tether the growing
product, which is assembled by sequential action of multiple catalytic domains in these
complex synthetases (Walsh et al., 1997).

A family of more than twenty PPTases is recognised by a common
(V/I)G(V/I)Dx40-45...(F/W)(S/C/T)xKE(A/S)xxK signature sequence, but overall
sequence homologies are low (Gehring et al., 1997; Lambalot et al., 1996; Nakano et al.,
1992; Quadri et al., 1998a). In E. coli, there are two PPTases with distinct specificities:
ACPS is active on acyl carrier protein (ACP) domains in fatty acid and polyketide
synthases; EntD is active on peptidyl carrier protein (PCP) and aryl carrier protein (ArCP)
domains in peptide synthetases (Lambalot et al., 1996; Walsh et al., 1997). Thus, PPTases
may be partner-protein specific. However, Sfp from B. subtilis appears to be non-specific,
efficiently activating fatty acid synthases, polyketide synthases and peptide synthetases
([illegible] et al., 1998; Mofid et al., 1999; Quadri et al., 1998a). XabA includes the PPTase VGID
and FSxKESxxK motifs. Although it has highest overall similarity to the peptide-selective
EntD proteins, the sequence groupings are not sufficiently compelling to predict the
specificity of XabA for polyketide synthases or peptide synthetases (Table 4, Figure 11).
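
By way of illustration only, the signature sequence quoted above can be written as a regular expression and used to scan a deduced protein sequence. The following is a minimal sketch and not the method used in this example; the function name and the toy sequence are hypothetical, and "x" is taken to mean any residue.

```python
import re

# Signature from the text: (V/I)G(V/I)D, a spacer of 40-45 residues,
# then (F/W)(S/C/T)xKE(A/S)xxK, with x standing for any residue.
PPTASE_SIGNATURE = re.compile(r"[VI]G[VI]D.{40,45}[FW][SCT].KE[AS]..K")


def find_pptase_signature(protein):
    """Return (start, end) of the first signature match, or None if absent."""
    match = PPTASE_SIGNATURE.search(protein.upper())
    return match.span() if match else None


if __name__ == "__main__":
    # Hypothetical toy sequence (not XabA): VGID, a 42-residue spacer,
    # then an FSxKESxxK-type second motif.
    toy = "MA" + "VGID" + "A" * 42 + "FSAKESLFK" + "GG"
    print(find_pptase_signature(toy))
```

A match of this kind shows only that both motifs are present at the expected spacing; as noted above, it does not predict carrier-protein specificity.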

Complementation studies have revealed substantial functional interchangeability
of PPTases in different bacteria. For example, the B. subtilis sfp gene involved in surfactin
biosynthesis complements mutants in E. coli entD (enterobactin biosynthesis) and B. brevis
gsp (gramicidin biosynthesis) (Borchert et al., 1994; Grossman et al., 1993). In vitro,
ACPS from E. coli activates apoproteins from Lactobacillus, Rhizobium and Streptomyces
(Lambalot et al., 1996). Because XabA shows highest similarity to EntD, we amplified the
entD-coding region from E. coli, and arranged it for expression from the lac promoter in the
broad host-range vector pLAFR3. This construct (pLENTD) restored albicidin production
in X. albilineans xabA::Tn5 mutant LS156. EntD is a peptide-selective PPTase that
converts inactive apo-EntF and apo-EntB to active holo-enzymes involved in biosynthesis
of enterobactin in E. coli (Gehring et al., 1997a). Functional complementation of the
xabA::Tn5 mutation by entD indicates that XabA is a PPTase required for
post-translational activation of synthetases involved in albicidin production in X. albilineans.
The specificity of EntD for activation of peptide synthetases in E. coli indicates that
albicidin biosynthesis probably involves an XabA-activated peptide synthetase.

Some PPTase genes involved in non-ribosomally synthesised peptide biogenesis
are located near the genes encoding their targets (Quadri et al., 1998b). For example, B.
brevis gsp, B. subtilis sfp, and E. coli entD genes all lie within 4 kb of operons encoding
the target peptide synthetases (Borchert et al., 1994; Coderre & Earhart, 1989; Nakano et
al., 1992). However, M. tuberculosis pptT is not located near the mbt gene cluster encoding
the target peptide synthetases involved in mycobactin biosynthesis (Quadri et al., 1998b).
No gene encoding a PPTase has been identified in any of the antibiotic and phytotoxin
biosynthetic gene clusters characterised from Streptomyces spp. (Gehring et al., 1997b)
and Pseudomonas spp. (Bender et al., 1999). No evident target gene was found within
1282 bp upstream or 870 bp downstream of xabA. Three cosmids spanning about 100 kb in
two regions of the genome complemented 56 of 58 tested Tox- mutants of X. albilineans,
but not LS156 (Rott et al., 1996). These results indicate that xabA is not clustered with the
genes encoding the antibiotic synthetases that it activates.

Expression of xabA (or an alternative PPTase such as entD) is essential for
albicidin biosynthesis. The phosphopantetheinyl transferase gene described herein provides
new insight into antibiotic biosynthesis in the Pseudomonadaceae, and new opportunities
to understand and apply albicidins as potent inhibitors of prokaryote DNA replication. This
gene, together with xabB, provides the means to engineer high-level expression of the
albicidin synthetase and its activating PPTase to obtain higher yields of albicidins, and
ultimately to manipulate the elements of this biosynthetic machinery, by mutagenesis or
otherwise, to produce desired structural variants of this novel antibiotic class. They may
also indicate a new approach to disease resistance, by engineering plants to interfere with
the biosynthesis of albicidin toxins, which are key pathogenesis factors for the systemic
development of leaf scald disease.

EXAMPLE 3

A methyltransferase gene is involved in albicidin biosynthesis in Xanthomonas albilineans
Material and Methods 



Bacterial strains and plasmids

The properties of bacteria and plasmids used in this example are listed in Table 5.

Media, culture conditions and antibiotics

X. albilineans strains were routinely cultured on sucrose peptone (SP) medium at
28° C (Birch and Patil, 1985b). Escherichia coli strains were used as hosts in cloning
experiments and were grown on LB medium at 37° C (Sambrook et al., 1989). Broth
cultures were aerated by shaking at 200 rpm on an orbital shaker. Modified YEB medium
(Van Larebeke et al., 1977) was used for patch mating. The following antibiotics were
added to media as required: kanamycin, 50 µg/mL; tetracycline, 15 µg/mL; ampicillin, 100
µg/mL.

Assay of albicidin production

Albicidin was quantified by a microbial plate bioassay as described previously
(Birch and Patil, 1985b), except that the 10 mL basal layer of LB agar and the 5 mL
overlayer of 50% LB with 1% agar were supplemented with tetracycline, and E. coli
DH5α [pLAFR3] was used as the indicator strain. This change avoided interference by
tetracycline, which was added to some cultures to ensure retention of pLAFR3 derivatives
in X. albilineans.

Routine genetic procedures

Bacterial genomic DNA and plasmid DNA isolation, gel electrophoresis, DNA
restriction digests, ligation reactions and transformation were performed by routine
procedures (Sambrook et al., 1989). DNA fragments were excised from agarose gels and
residual agarose was removed with the BRESAclean™ DNA purification kit (GeneWorks,
Adelaide).

DNA sequencing and analysis

Sequencing reactions were performed by dideoxynucleotide chain termination
(Sanger et al., 1977) using the BigDye™ Terminator Cycle Sequencing Kit and 373A
DNA sequencer (PE Applied Biosystems) through the Australian Genome Research
Facility. Oligonucleotide primers were purchased from GeneWorks (Adelaide). University
of Wisconsin Genetics Computer Group (UWGCG) programs BLASTP, FASTA, PILEUP,
and BESTFIT were used through WebANGIS version 2.0 for DNA and protein sequence
analyses of the GenBank, EMBL, PIR and SWISSPROT databases.

Recovery of the downstream sequence of truncated xabC by IPCR

Genomic DNA of X. albilineans LS155 was digested with NcoI. Following
phenol/chloroform extraction and ethanol precipitation, the digested DNA was self-ligated
at a concentration of 0.5 µg/mL. The ligated DNA was precipitated with ethanol and
resuspended in sterile H2O to a concentration of 20 ng/µL as template for IPCR. Sequence
of the 16.5 kb EcoRI fragment including the 5' region of xabC was used to design primers
(IF: 5'-AAGCGTCGACATAGCAGCAG-3' and IR:
5'-CGGCAACGCATTCGACCTCG-3') for IPCR-amplification of the sequence downstream
of the EcoRI site of the truncated xabC gene.
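
As a rough check on primer design, the GC content and an approximate melting temperature of the two IPCR primers can be estimated as sketched below. This is an illustrative calculation only and not part of the described procedure: it assumes the primer sequences read as given above once the spacing introduced by scanning is removed, and it uses the simple Wallace rule (2° C per A/T and 4° C per G/C), which is only a coarse estimate for short oligonucleotides.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Approximate Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as read from the text (assumed free of scanning errors).
PRIMERS = {
    "IF": "AAGCGTCGACATAGCAGCAG",
    "IR": "CGGCAACGCATTCGACCTCG",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```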

IPCR was performed in a volume of 50 µL with 50 ng of template DNA, 0.4
ng/µL of each primer, 0.2 mM of each dNTP, 1.8 mM Mg2+, and 1 unit of Elongase
enzyme mix with proof-reading activity (Life Technologies). A 10 min initial denaturation
step at 94° C was followed by 35 thermal cycles of denaturation at 94° C for 1 min,
annealing at 55° C for 1 min, and extension at 72° C for 1 min per 1 kb of expected
amplification product. The IPCR product was cloned into pZErO-2 to give pZIXC. Clones
of construct pZIXC from three independent PCR reactions were sequenced to rule out the
possibility of PCR-generated errors.
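
The cycling conditions above imply a total hold time that scales with the expected product length. The sketch below simply adds up the stated steps; the function name and default arguments are illustrative assumptions, and ramp times between temperatures are ignored.

```python
def ipcr_hold_minutes(product_kb, cycles=35, initial_denaturation=10.0,
                      denaturation=1.0, annealing=1.0, extension_per_kb=1.0):
    """Total programmed hold time in minutes (temperature ramping excluded)."""
    per_cycle = denaturation + annealing + extension_per_kb * product_kb
    return initial_denaturation + cycles * per_cycle


if __name__ == "__main__":
    # A 1 kb expected product: 10 + 35 * (1 + 1 + 1) minutes.
    print(ipcr_hold_minutes(1.0))
```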

Insertional mutagenesis

An internal 625 bp ClaI-EcoRI fragment of xabC (Figure 13) was firstly cloned
into ClaI/EcoRI-digested pBluescript II SK to provide a KpnI restriction site, then
subcloned into EcoRI/KpnI-cleaved pJP5603 to yield pJP-BEC. The inserts in pBluescript
II SK intermediates (pBEC) were sequenced to confirm the expected clones.

The suicide construct pJP-BEC was transferred from the mobilising strain E. coli
S17-1 (λpir) into X. albilineans LS155. Exconjugant colonies were selected on SP agar
containing kanamycin and ampicillin. Insertional disruption in xabC or thp was verified by
PCR using primers flanking the expected integration site of pJP-BEC or pJP-BAS and
extension at 72° C for 1 min as previously described (Zhang and Birch, 1997b). The effect
on albicidin biosynthesis was determined using the microbial plate assay. Representative
(Tox-) insertional mutants in xabC (LS-JP1) and thp (LS-JP2) were retained for further
analysis.

Construction of expression vectors

The coding region of the xabC gene was amplified from X. albilineans LS155
chromosomal DNA by PCR. Primer A3F (5'-CGGGATCCCATGGATTCAGCGTTACC-3')
contained a BamHI restriction site (GGATCC) for insertion of the amplified gene into
the correct reading frame of lacZ in pLAFR3. Primer A3R
(5'-CCCAAGCTTTCATTATGGGGCCCTCTTGC-3') introduced a HindIII restriction site
(AAGCTT). The amplified
DNA was digested with BamHI and HindIII, and ligated with BamHI/HindIII-digested
pLAFR3 to result in pLXABC. X. albilineans Tox- mutant LS157 contains a single Tn5
insertion, in a 4.1 kb ClaI restriction fragment or a 16.5 kb EcoRI restriction fragment
(Figure 12). Selection for kanamycin resistance, following shotgun cloning of ClaI
restriction fragments of LS157 DNA into pBluescript II SK, yielded clone pBC157.
Sequences flanking the Tn5 insertion in LS157 DNA were amplified by inverse PCR, and
cloned into pZErO-2, producing pZIL and pZIR. The double-strand sequence of the 16,511
bp EcoRI genomic fragment in pLXABB was obtained by a primer-walking approach,
using subclones pBC157, pZIL, pZIR, pSEBL, and pSEBR. The Tn5 insertion in the
genome of LS157 is accompanied by a 9-bp perfect repeat sequence (GTCCTGAAG),
commencing at 2490 bp in GenBank accession no. AF239749.
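
The restriction sites engineered into primers A3F and A3R can be confirmed by a simple string search, as sketched below. This is an illustrative check only: it assumes the primer sequences read as given above once the spacing introduced by scanning is removed, and it uses the standard recognition sequences of BamHI (GGATCC) and HindIII (AAGCTT). The dictionary and function names are hypothetical.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Primer sequences as read from the text (assumed free of scanning errors).
PRIMERS = {
    "A3F": "CGGGATCCCATGGATTCAGCGTTACC",
    "A3R": "CCCAAGCTTTCATTATGGGGCCCTCTTGC",
}


def find_sites(seq):
    """Map each enzyme whose site occurs in seq to the 0-based start index."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```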

Genetic complementation of albicidin biosynthesis

DNA transfer between E. coli donor (JM109 pLAFR3 ± insert) and X. albilineans
recipient (LS-JP1 or LS-JP2) was accomplished by triparental transconjugation with
helper strain pRK2013. Mid-log-phase cultures of the recipient were spotted onto agar
plates containing YEB medium with no antibiotics (20 µL per spot). After the liquid was
absorbed by the agar, 20 µL of mid-log-phase culture of the helper was added to each spot.
The liquid was again allowed to absorb, and 20 µL of mid-log-phase culture of the donor
was added to each spot. After incubation of the mating plates overnight at 28° C,
transconjugants were selected on SP plates supplemented with ampicillin, and tetracycline
or spectinomycin.

Transconjugants were tested for albicidin production using the microbial plate
bioassay. The constructs pLXABB and pLXABC were designed to test complementation in
trans. However, complementation could also occur in cis, by homologous recombination
between the complementing construct and the insertionally mutated chromosomal gene. To
exclude this possibility, the retention of the insertion in xabC was confirmed by PCR,
using primers from aphA (in the insertion) and xabB (adjoining xabC in the chromosome).

Results and Discussion 

Cloning and sequencing of the full-length xabC gene

Downstream by 45 bp from the TAG stop codon of xabB is the start of an ORF
(designated xabC) in the same orientation. The 639-bp sequence downstream of the EcoRI
site of the truncated xabC was amplified from wt X. albilineans LS155 using IPCR. The
double-strand nucleotide sequence of 1515 bp from the stop codon of xabB to the NcoI site
downstream of xabC (Figure 13) is deposited in GenBank under accession no. AF239750.
The xabC ORF encodes a protein of 343 aa (Mr 37,704). One TCTG-like sequence
(TGTG) and one typical TCTG box characteristic of many factor-independent termination
sites (Brendel and Trifonov, 1984) occur downstream of the termination codon (TAA) of
xabC (Fig. 2). However, the other features typical of such terminators (a region of GC rich
dyad symmetry, followed by a run of consecutive thymine residues) are not present within
435 bp downstream of the xabC stop codon.
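
The reported size of the xabC product can be cross-checked with simple arithmetic, as sketched below. The helper names are illustrative only. A 343-residue protein corresponds to a 1029 bp coding sequence excluding the stop codon, which agrees with the 1029 bp xabC ORF listed for pLXABC in Table 5, and the stated Mr implies a mean residue mass close to the value of about 110 commonly assumed for proteins.

```python
def orf_length_bp(n_residues, include_stop=False):
    """Coding length in bp for a protein of n_residues, optionally with the stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)


def mean_residue_mass(mr, n_residues):
    """Average mass per residue implied by a relative molecular mass."""
    return mr / n_residues


if __name__ == "__main__":
    print(orf_length_bp(343))                      # coding sequence without stop codon
    print(orf_length_bp(343, include_stop=True))   # with the TAA stop codon
    print(round(mean_residue_mass(37704, 343), 1))
```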

XabC is similar to O-methyltransferases

The deduced product of xabC shows 22-30% overall identity and 52-60% overall
similarity to a family of methyltransferases that utilise S-adenosyl-methionine (SAM) as a
co-substrate for O-methylation of small molecules (Ingrosso et al., 1989; Haydock et al.,
1991; Kagan and Clarke, 1994). These enzymes include tetracenomycin polyketide C-8
O-methyltransferase (TcmO, P39896) and C-3 O-methyltransferase (TcmN, P16559) of
Streptomyces glaucescens, hydroxyneurosporene-O-methyltransferase (P17061) of
Rhodobacter capsulatus, and hydroxyindole-O-methyltransferases of rat pineal and
retina (O09179) and chicken pineal gland (Q92056). Three highly conserved motifs in
SAM-dependent methyltransferases are also present in XabC as shown in Figures 13 and
14. The crystal structure analysis for the methyltransferase-SAM complex (Schluckebier et
al., 1995) provides firm structural evidence for the role of motif I in SAM binding.

Insertional mutagenesis of xabC blocks albicidin biosynthesis

Insertional mutation in xabC was accomplished using suicide-vector pJP-BEC and
confirmed by PCR. Six out of eight tested transconjugants were verified by PCR to contain
insertional mutations in xabC. Albicidin production was undetectable in these insertional
mutants, compared to wt X. albilineans LS155 control. The other transconjugants may
result from integration of the vector at other genomic locations by illegitimate
recombinations as reported previously (Penfold and Pemberton, 1992).



Complementation test

Introduction of the xabC gene in pLXABC or the truncated xabC gene in
pLXABB into insertional mutant LS-JP2 restored albicidin production to the level of the
wt parental strain LS155. This indicates that xabC is essential for albicidin production in X.
albilineans. The truncated xabC in pLXABB (SEQ ID NO: 106) encodes 277 residues
(SEQ ID NO: 107), including all of the three conserved motifs of SAM-methyltransferases,
and appears fully functional by complementation. The continued presence of an insertion
in the chromosomal locus was confirmed by PCR. Thus, complementation was operating
in trans. This also indicates that no other cistron downstream of xabC is required for
albicidin production, because insertional mutagenesis typically causes polar mutations
affecting all downstream cistrons in an operon (De Bruijn and Lupski, 1984).

Enhanced expression of xabC results in increased production of albicidins

Derivatives of X. albilineans strain LS155, in which an xabC gene, or fragment
thereof, was introduced in trans, were tested for production of albicidin using the bioassay
described above. The results, presented in Figure 15, show that expression of xabC cloned
into pLAFR3 in derivatives of X. albilineans strain LS155 complements an insertional
mutation in the chromosomal xabC, and also enhances albicidin production early in the
production phase. Expression of the first part of the xabB operon, including the full-length
xabB and a truncated but functional xabC, further enhances albicidin production.

The disclosure of every patent, patent application, and publication cited herein is
hereby incorporated herein by reference in its entirety.

The citation of any reference herein should not be construed as an admission that
such reference is available as "Prior Art" to the instant application.

Throughout the specification the aim has been to describe the preferred
embodiments of the invention without limiting the invention to any one embodiment or
specific collection of features. Those of skill in the art will therefore appreciate that, in
light of the instant disclosure, various modifications and changes can be made in the
particular embodiments exemplified without departing from the scope of the present
invention. All such modifications and changes are intended to be included within the scope
of the appended claims.

TABLES
TABLE 1

Bacterial strains and plasmids for Example 1

Strain or plasmid | Relevant characteristics | Reference or source
Strains | |
E. coli | |
DH5α | Φ80dlacZΔM15, Δ(lacZYA-argF)U169 | Promega
JM109 | [F', lacIqZΔM15], Δ(lac-proAB) | Promega
TOP10 | F-, Δ(mrr-hsdRMS-mcrBC), Δ(ara-leu)7697, ΔlacX74 | Invitrogen
X. albilineans | |
Xal3 | Wild-type albicidin producer from sugarcane (Queensland), Ap^r | Inventor's laboratory
LS155 | Wild-type albicidin producer from sugarcane (Queensland), Ap^r | Wall and Birch (1997)
LS157 | LS155::Tn5, albicidin deficient (Tox-), Km^r St^r Ap^r | Wall and Birch (1997)
Plasmids | |
pBluescript II SK | ColE1 origin, E. coli cloning vector, Ap^r | Stratagene
pZErO-2 | ColE1 origin, E. coli cloning vector, Km^r | Invitrogen
pRK2013 | ColE1 origin, IncP, Tra+, helper plasmid, Km^r | Ditta et al. (1980)
pLAFR3 | RK2 origin, Tra-, Mob+, broad host-range cosmid, Tc^r | Staskawicz et al. (1987)
pRG960sd | ColE1 origin, broad host-range plasmid, contains promoterless uidA with start codon and Shine-Dalgarno sequence, Sm^r Sp^r | Van den Eede et al. (1992)
pBC157 | 9.9-kb ClaI fragment carrying Tn5 and flanking sequences from LS157, in pBluescript II SK, Km^r Ap^r | This study
pZIL | 1.4-kb fragment, inverse PCR amplified from LS157 in pZErO-2, Km^r | This study
pZIR | 6.0-kb fragment, inverse PCR amplified from LS157 in pZErO-2, Km^r | This study
pZTI | 0.9-kb fragment, PCR amplified from LS157 in pZErO-2, Km^r | This study
pXABB | 16.5-kb EcoRI fragment from Xal3 in pBluescript II SK, Ap^r | This study
pSEBL | 7.9-kb EcoRI-SpeI fragment from pXABB in pBluescript II SK, Ap^r | This study
pSEBR | 8.6-kb EcoRI-SpeI fragment from pXABB in pBluescript II SK, Ap^r | This study
pLXABB1 | 16.5-kb EcoRI fragment from pXABB in pLAFR3 (xabB in the same direction as lac), Tc^r | This study
pLXABB2 | 16.5-kb EcoRI fragment from pXABB in pLAFR3 (xabB in the opposite direction to lac), Tc^r | This study
pRG960p1 | 206-bp BamHI-XmaI fragment in pRG960sd, Sm^r Sp^r | This study
pRG960p2 | 690-bp BamHI-XmaI fragment in pRG960sd, Sm^r Sp^r | This study

TABLE 2



Comparison of conserved sequences in peptide synthetases and XabB

Domain | Core | Sequence conserved in peptide synthetases(a) | Sequence in XabB | Position in XabB (aa)
Adenylation | A1 | L(T/S)YxEL | WSYAQL | 3806-3811
 | A2 | LKAGxAYL(V/L)P(L/I)D | FKAGACYVPID | 3851-3861
 | A3 | LAYxxYTSG(S/T)TGxPKG | LACVMVTSGSTGRPKG | 3917-3932
 | A4 | FDxS | rAVo | 3967-3970
 | A5 | NxYGPTE | NNYGCTE | 4103-4109
 | A6 | GELxIxGxG(V/L)ARGYL | GELHVHSVGMARGYW | 4114-4128
 | A7 | Y(R/K)TGDL | YKTGDM | 4152-4157
 | A8 | GRxDxQVKIRGxRIELGEIE | GRQDFEVKVRGHRVDTRQVE | 4170-4189
 | A9 | LPxYM(I/V)P | LPTYMLP | 4239-4245
 | A10 | NGK(V/L)DR | NGKLDR | 4259-4264
Peptidyl carrier protein | PCP | DxFFxLGG(H/D)S(L/I) | DNFFALGGHSL | 4306-4316
 | | | MDFFAVGGHSV | 3261-3271
Condensation | C1 | SxAQxR(L/M)(W/Y)xL | TYAQERLWLV | 3333-3342
 | | | SLFQERLWFV | 4374-4383
 | C2 | RHExLRTxF | RHEVLRTRF | 3381-3389
 | | | RHEILRTRF | 4421-4429
 | C3 | MHHxISDG(W/V)S | IHHIISDGWS | 3456-3465
 | | | MHHLIYDAWS | 4498-4507
 | C4 | YxD(F/Y)AVW | YADYALW | 3495-3501
 | | | YADYAIW | 4538-4544
 | C5 | (I/V)GxFVNT(Q/L)(C/A)xR | IGFFINILPLR | 3606-3617
 | | | IGFFINILPLR | 4649-4659
 | C6 | (H/N)QD(Y/V)PFE | HQSVPFE | 3641-3647
 | | | NQALPFE | 4685-4691
 | C7 | RDxSRNPL | RDSSQIPL | 3658-3665
 | | | RDTSRIPL | 4701-4708

(a) Sourced from reference (Marahiel et al., 1997).



WO 02/024736 PCT/AU01/01190

TABLE 3



Bacterial strains and plasmids for Example 2

Strain or plasmid | Relevant characteristics | Reference or source
Strains | |
E. coli | |
DH5α | Φ80dlacZΔM15, recA1, endA1, gyrA96, thi-1, hsdR17(rk-, mk+), supE44, relA1, deoR, Δ(lacZYA-argF)U169 | Promega
JM109 | [F', traD36, proAB, lacIqZΔM15], recA1, endA1, gyrA96, thi, hsdR17(rk-, mk+), supE44, relA1, Δ(lac-proAB) | Promega
X. albilineans | |
Xal3 | Wild-type albicidin producer from sugarcane (Queensland), Ap^r | This laboratory
LS155 | Wild-type albicidin producer from sugarcane (Queensland), Ap^r | Wall & Birch (1997)
LS156 | LS155::Tn5, albicidin deficient (Tox-), Km^r St^r Ap^r | Wall & Birch (1997)
Plasmids | |
pBluescript II SK | ColE1 origin, E. coli cloning vector, Ap^r | Stratagene
pGEM-T | ColE1 origin, E. coli TA-cloning vector, Ap^r | Promega
pRK2013 | ColE1 origin, IncP, Tra+, helper plasmid, Km^r | Ditta et al. (1980)
pLAFR3 | RK2 origin, Tra-, Mob+, broad host-range cosmid, Tc^r | Staskawicz et al. (1987)
pBEA1 | 8.8-kb EcoRI fragment carrying Tn5 and flanking sequences from LS156, in pBluescript II SK, Km^r Ap^r | This study
pCEA1 | 1766-bp EcoRI-ClaI fragment from pBEA1 in pBluescript II SK, Ap^r | This study
pPEA1 | 697-bp EcoRI-PstI fragment from pBEA1 in pBluescript II SK, Ap^r | This study
pGTA1 | 2.1-kb fragment, PCR amplified from LS155 in pGEM-T, Ap^r | This study
pLXABA | 834-bp EcoRI-BamHI fragment (xabA ORF) from pGTA1 in pLAFR3, Tc^r | This study
pLENTD | 630-bp EcoRI-HindIII fragment (entD ORF) from DH5α in pLAFR3, Tc^r | This study

TABLE 4

Similarity between XabA and other PPTases involved in antibiotic and fatty acid
biosynthesis in bacteria

Pathway | Protein | Organism | Specificity (A/P)† | Domain I ... Domain II | Homology (ID/SIM)
Albicidin | XabA | X. albilineans | ? | GVGIDLERP--(X)39--FSAKESLFKA7Y | -
Enterobactin | EntD | E. coli | P† | PIGIDIKEI--(X)36--FSAKESAFKASE | 35.9/56.5
 | | S. flexneri | ? | PIGVDIEEI--(X)36--FSAKESAFKAS[illegible] | 35.0/55.6
 | | S. typhimurium | ? | RIGIDIEKI--(X)35--FSAKESVYKAFQ | 35.9/62.0
 | | S. austin | [illegible] | RVGVDIKKI--(X)[illegible] | 36.1/61.[illegible]
Mycobactin | PptT | M. tuberculosis | P | SVGIDAEPH--(X)34--FCAKEATYKAWF | 30.5/55.5
Surfactin | Sfp | B. subtilis | A/P* | PIGIDIKKT--(X)35--WSMKESFTKQK3 | 24.8/48.5
 | Psf-1 | B. pumilus | ? | PVGIDIEEI--(X)35--WSMKEAFIKLTG | 19.8/47.6
Gramicidin | Gsp | B. brevis | P* | PVGIDIERI--(X)35--WTIKESYIKAIG | 20.8/42.0
Iturin A | Lpa-14 | B. subtilis | ? | PIGIDIEKM--(X)35--WSMKESFIKQAG | 20.0/43.4
Fatty acids | HI0152 | H. influenzae | ? | AVGIDIKFP--(X)34--WCLREAVLKSQG | 19.7/45.7
 | AcpS | E. coli | A* | GLGTDIVEI--(X)40--FAVKEAAAKAFG | [illegible]
 | | M. tuberculosis | A | GVGIDIiVSI--(X)41--WAAKEAVIKAWS | 25.7/47.6
 | | B. subtilis | ? | GIGLDITEL--(X)41--FAAKEAFSKAFG | 25.5/46.2
PPTase domains | | | | (V/I)G(I/V)D ... (F/W)(S/C/T)xKE(S/A)xxK |

TABLE 5



Bacterial strains and plasmids for Example 3

Strain or plasmid | Characteristics | Reference or source
Strains | |
E. coli | |
DH5α | Φ80dlacZΔM15, Δ(lacZYA-argF)U169 |
JM109 | [F', lacIqZΔM15], Δ(lac-proAB) |
TOP10 | F-, Δ(mrr-hsdRMS-mcrBC), Δ(ara-leu)7697, ΔlacX74 | Invitrogen
S17-1 λpir | S17-1 lysogenized with λpir | Penfold and Pemberton (1992)
X. albilineans | |
Xal3 | wt albicidin producer from sugarcane (Queensland), Ap^r | Our laboratory
LS155 | wt albicidin producer from sugarcane (Queensland), Ap^r | Wall and Birch (1997)
LS157 | xabB::Tn5, albicidin deficient (Tox-), Km^r St^r Ap^r | Wall and Birch (1997)
LS-JP1 | thp::pJP-BAS, albicidin deficient (Tox-), Km^r Ap^r | This work
LS-JP2 | xabC::pJP-BEC, albicidin deficient (Tox-), Km^r Ap^r | This work
Plasmids | |
pBluescript II SK | ColE1 origin, E. coli cloning vector, Ap^r | Stratagene
pZErO-2 | ColE1 origin, E. coli cloning vector, Km^r | Invitrogen
pRK2013 | ColE1 origin, IncP, Tra+, helper plasmid, Km^r | Ditta et al. (1980)
pLAFR3 | RK2 origin, Tra-, Mob+, broad host-range cosmid, Tc^r | Staskawicz et al. (1987)
pJP5603 | Bacterial suicide vector, Km^r | Penfold and Pemberton (1992)
pZIXC | 1 kb IPCR product in pZErO-2, Km^r | This work
pBAS | 278 bp ApaI-SalI fragment of thp in pBluescript II SK, Ap^r | This work
pJP-BAS | 284 bp SalI-KpnI fragment from pBAS in pJP5603, Km^r | This work
pBEC | 625 bp ClaI-EcoRI fragment of xabC in pBluescript II SK, Ap^r | This work
pJP-BEC | 655 bp EcoRI-KpnI fragment from pBEC in pJP5603, Km^r | This work
pLTHP | 1226 bp EcoRI-BamHI fragment from pLXABB in pLAFR3, Tc^r | This work
pLXABC | 1029 bp xabC ORF amplified from LS155 in pLAFR3, Tc^r | This work
pLXABB | 16.5 kb EcoRI fragment from Xal3 in pLAFR3, Tc^r | This work

CLAIMS

1. An isolated polypeptide comprising at least one domain selected from the group
consisting of:

(a) an acyl-CoA ligase (AL) domain comprising a sequence set forth in any one or
more of SEQ ID NO: 6 and 8, or variants thereof;

(b) a β-ketoacyl synthase (KS) domain comprising a sequence set forth in any one or
more of SEQ ID NO: 10, 12, 14, 16, 18 and 20, or variants thereof;

(c) a β-ketoacyl reductase (KR) domain comprising the sequence set forth in SEQ ID
NO: 22, or variants thereof;

(d) an acyl carrier protein (ACP) domain comprising a sequence set forth in any one
or more of SEQ ID NO: 24, 26 and 28, or variants thereof;

(e) an adenylation (A) domain comprising a sequence set forth in any one or more of
SEQ ID NO: 30, 32, 34, 36, 38, 40, 42, 44, 46 and 48, or variants thereof;

(f) a peptidyl carrier protein (PCP) domain comprising a sequence set forth in any
one or more of SEQ ID NO: 50 and 52, and variants thereof; and

(g) a condensation (C) domain comprising a sequence set forth in any one or more of
SEQ ID NO: 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78 and 80, or variants
thereof.

2. The polypeptide of claim 1, wherein the AL domain comprises each of the sequences 
20 set forth in SEQ ID NO: 6 and 8, or variants thereof. 

3. The polypeptide of claim 1, wherein the KS domain comprises each of the sequences 
set forth in SEQ ID NO: 10, 12 and 14, or variants thereof. 

4. The polypeptide of claim 1, wherein the KS domain comprises each of the sequences 
set forth in SEQ ID NO: 16, 18 and 20, or variants thereof. 

5. The polypeptide of claim 1, wherein the A domain comprises each of the sequences set
forth in SEQ ID NO: 30, 32, 34, 36, 38, 40, 42, 44, 46 and 48, or variants thereof.

6. The polypeptide of claim 1, wherein the C domain comprises each of the sequences set
forth in SEQ ID NO: 54, 56, 58, 60, 62, 64 and 66, or variants thereof.

7. The polypeptide of claim 1, wherein the C domain comprises each of the sequences set
forth in SEQ ID NO: 68, 70, 72, 74, 76, 78 and 80, or variants thereof.

8. The polypeptide of claim 1, wherein the domains are arranged in an N- to C-terminal
direction as follows: AL-ACP-KS-KR-ACP-ACP-KS-PCP-C-A-PCP-C.

9. The polypeptide of claim 1, comprising the sequence set forth in SEQ ID NO: 2, or 
biologically active fragment thereof, or variant or derivative of these. 

10. The polypeptide of claim 9, wherein the variant has at least 60% sequence identity to
the sequence set forth in SEQ ID NO: 2.

11. The polypeptide of claim 9, wherein the biologically active fragment is at least 6 amino
acids in length.

12. An isolated polypeptide comprising at least a biologically active fragment of the
sequence set forth in SEQ ID NO: 2 or variant or derivative thereof.

13. The polypeptide of claim 12, wherein the biologically active fragment is at least 6 
amino acids in length. 

14. The polypeptide of claim 12, wherein the biologically active fragment comprises at
least one domain selected from the group consisting of:

(a) an acyl-CoA ligase (AL) domain comprising a sequence set forth in any one or
more of SEQ ID NO: 6 and 8, or variants thereof;

(b) a β-ketoacyl synthase (KS) domain comprising a sequence set forth in any one or
more of SEQ ID NO: 10, 12, 14, 16, 18 and 20, or variants thereof;

(c) a β-ketoacyl reductase (KR) domain comprising the sequence set forth in SEQ ID
NO: 22, or variants thereof;

(d) an acyl carrier protein (ACP) domain comprising a sequence set forth in any one
or more of SEQ ID NO: 24, 26 and 28, or variants thereof;

(e) an adenylation (A) domain comprising a sequence set forth in any one or more of
SEQ ID NO: 30, 32, 34, 36, 38, 40, 42, 44, 46 and 48, or variants thereof;

(f) a peptidyl carrier protein (PCP) domain comprising a sequence set forth in any
one or more of SEQ ID NO: 50 and 52, and variants thereof; and

(g) a condensation (C) domain comprising a sequence set forth in any one or more of
SEQ ID NO: 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78 and 80, or variants
thereof.

15. The polypeptide of claim 13, wherein the AL domain comprises each of the sequences
set forth in SEQ ID NO: 6 and 8, or variants thereof.

16. The polypeptide of claim 13, wherein the KS domain comprises each of the sequences 
set forth in SEQ ID NO: 10, 12 and 14, or variants thereof. 

5 17. The polypeptide of claim 13, wherein the KS domain comprises each of the sequences 
set forth in SEQ ID NO: 16, 18 and 20, or variants thereof. 

18. The polypeptide of claim 13, wherein the A domain comprises each of the sequences 
set forth in SEQ ID NO: 30, 32, 34, 36, 38, 40, 42, 44, 46 and 48, or variants thereof. 

19. The polypeptide of claim 13, wherein the C domain comprises each of the sequences 
set forth in SEQ ID NO: 54, 56, 58, 60, 62, 64 and 66, or variants thereof.

20. The polypeptide of claim 13, wherein the C domain comprises each of the sequences 
set forth in SEQ ID NO: 68, 70, 72, 74, 76, 78 and 80, or variants thereof. 

21. The polypeptide of claim 12, wherein the variant has at least 60% sequence identity to 
said at least a biologically active fragment. 

22. The polypeptide of claim 12, wherein the variant has at least 70% sequence identity to
any one of the amino acid sequences set forth in SEQ ID NO: 6, 8, 10, 12, 14, 16, 18, 20, 
22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 
70, 72, 74, 76, 78 or 80. 

23. An isolated polypeptide comprising at least a biologically active fragment of the
sequence set forth in SEQ ID NO: 83, or a variant or derivative thereof.

24. The polypeptide of claim 23, wherein the biologically active fragment comprises at 
least one of the consensus PPTase sequence motifs set forth in SEQ ID NO: 89 or 93, or 
variant thereof. 

25. The polypeptide of claim 24, wherein the biologically active fragment comprises both 
the consensus PPTase sequence motifs set forth in SEQ ID NO: 89 or 93, or variant
thereof.

26. The polypeptide of claim 23, wherein the biologically active fragment comprises the
intervening sequence between said consensus PPTase sequence motifs, which intervening 
sequence comprises the sequence set forth in SEQ ID NO: 91, or variant thereof. 

27. The polypeptide of claim 23, wherein the biologically active fragment comprises a 
contiguous sequence of amino acids contained within the sequence set forth in SEQ ID
NO: 87, or variant thereof.

28. The polypeptide of claim 23, wherein the biologically active fragment is at least 6 
amino acids in length. 

29. The polypeptide of claim 23, wherein the variant has at least 60% sequence identity to 
the sequence set forth in SEQ ID NO: 83.

30. The polypeptide of claim 23, wherein the variant has at least 70% sequence identity to 
any one of the amino acid sequences set forth in SEQ ID NO: 87, 89, 91 or 93. 

31. An isolated polypeptide comprising at least a biologically active fragment of the
sequence set forth in SEQ ID NO: 95, or a variant or derivative thereof. 

32. The polypeptide of claim 31, wherein the biologically active fragment comprises at
least one of the consensus methyltransferase sequence motifs set forth in SEQ ID NO: 99,
101 or 103, or variant thereof. 

33. The polypeptide of claim 31, wherein the biologically active fragment comprises all the 
consensus methyltransferase sequence motifs set forth in SEQ ID NO: 99, 101 and 103, or
variant thereof.

34. The polypeptide of claim 31, wherein the biologically active fragment comprises a 
contiguous sequence of amino acids contained within the sequence set forth in SEQ ID 
NO: 105, or variant thereof. 

35. The polypeptide of claim 31, wherein the biologically active fragment comprises a 
contiguous sequence of amino acids contained within the sequence set forth in SEQ ID
NO: 107, or variant thereof.

36. The polypeptide of claim 31, wherein the biologically active fragment is at least 6 
amino acids in length. 

37. The polypeptide of claim 31, wherein the variant has at least 60% sequence identity to
the sequence set forth in SEQ ID NO: 95. 

38. The polypeptide of claim 31, wherein the variant has at least 70% sequence identity to 
any one of the amino acid sequences set forth in SEQ ID NO: 99, 101 or 103. 

39. An isolated polynucleotide comprising a sequence encoding at least one domain
selected from the group consisting of: 

(a) an acyl-CoA ligase (AL) domain comprising a sequence set forth in any one or
more of SEQ ID NO: 6 and 8, or variants thereof;

(b) a β-ketoacyl synthase (KS) domain comprising a sequence set forth in any one or
more of SEQ ID NO: 10, 12, 14, 16, 18 and 20, or variants thereof;

(c) a β-ketoacyl reductase (KR) domain comprising the sequence set forth in SEQ ID
NO: 22, or variants thereof;

(d) an acyl carrier protein (ACP) domain comprising a sequence set forth in any one
or more of SEQ ID NO: 24, 26 and 28, or variants thereof; 

(e) an adenylation (A) domain comprising a sequence set forth in any one or more of
SEQ ID NO: 30, 32, 34, 36, 38, 40, 42, 44, 46 and 48, or variants thereof;

(f) a peptidyl carrier protein (PCP) domain comprising a sequence set forth in any
one or more of SEQ ID NO: 50 and 52, and variants thereof; and

(g) a condensation (C) domain comprising a sequence set forth in any one or more of
SEQ ID NO: 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78 and 80, or variants
thereof.

40. The polynucleotide of claim 39, wherein the AL domain is encoded by a nucleotide 
sequence set forth in any one or more of SEQ ID NO: 5 or 7, or variants thereof.

41. The polynucleotide of claim 40, wherein the AL domain is encoded by a nucleotide 
sequence comprising each of the sequences set forth in SEQ ID NO: 5 and 7, or variants
thereof.

42. The polynucleotide of claim 39, wherein the KS domain is encoded by a nucleotide 
sequence set forth in any one or more of SEQ ID NO: 9, 11, 13, 15, 17 and 19, or variants
thereof.

43. The polynucleotide of claim 42, wherein the KS domain is encoded by a nucleotide
sequence comprising each of the sequences set forth in SEQ ID NO: 9, 11 and 13, or 
variants thereof. 

44. The polynucleotide of claim 42, wherein the KS domain is encoded by a nucleotide 
sequence comprising each of the sequences set forth in SEQ ID NO: 15, 17 and 19, or
variants thereof.

45. The polynucleotide of claim 39, wherein the KR domain is encoded by a nucleotide 
sequence set forth in SEQ ID NO: 21, or variant thereof. 

46. The polynucleotide of claim 39, wherein the ACP domain is encoded by a nucleotide 
sequence set forth in any one or more of SEQ ID NO: 23, 25 and 27, or variants thereof.

47. The polynucleotide of claim 39, wherein the A domain is encoded by a nucleotide 
sequence set forth in any one or more of SEQ ID NO: 29, 31, 33, 35, 37, 39, 41, 43, 45 and
47, or variants thereof.

48. The polynucleotide of claim 47, wherein the A domain is encoded by a nucleotide 
sequence comprising each of the sequences set forth in SEQ ID NO: 29, 31, 33, 35, 37, 39,
41, 43, 45 and 47, or variants thereof.

49. The polynucleotide of claim 39, wherein the PCP domain is encoded by a nucleotide 
sequence set forth in any one or more of SEQ ID NO: 49 and 51, or variants thereof. 

50. The polynucleotide of claim 39, wherein the C domain is encoded by a nucleotide 
sequence set forth in any one or more of SEQ ID NO: 53, 55, 57, 59, 61, 63, 65, 67, 69, 71,
73, 75, 77 and 79, or variants thereof.

51. The polynucleotide of claim 50, wherein the C domain is encoded by a nucleotide 
sequence comprising each of the sequences set forth in SEQ ID NO: 53, 55, 57, 59, 61, 63 
and 65, or variants thereof. 

52. The polynucleotide of claim 50, wherein the C domain is encoded by a nucleotide
sequence comprising each of the sequences set forth in SEQ ID NO: 67, 69, 71, 73, 75, 77 
and 79, or variants thereof. 

53. The polynucleotide of claim 39, comprising the sequence set forth in any one of SEQ
ID NO: 1 or 3, or a biologically active fragment thereof at least 18 nucleotides in length, or 
a polynucleotide variant of these. 

54. The polynucleotide of claim 53, wherein the polynucleotide variant has at least 60% 
sequence identity to any one of the polynucleotides set forth in SEQ ID NO: 1 or 3.

55. The polynucleotide of claim 53, wherein the polynucleotide variant is capable of 
hybridising to any one of the polynucleotides identified by SEQ ID NO: 1 or 3 under at 
least low stringency conditions. 

56. The polynucleotide of claim 39, wherein the polynucleotide variant comprises a 
nucleotide sequence encoding at least one said domain.

57. The polynucleotide of claim 56, wherein the nucleotide sequence variant has at least 
60% sequence identity to any one or more of the sequences set forth in SEQ ID NO: 5, 7, 
9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 
57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77 and 79. 

58. The polynucleotide of claim 56, wherein the nucleotide sequence variant is capable of
hybridising to any one of the sequences identified by SEQ ID NO: 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 
67, 69, 71, 73, 75, 77 and 79 under at least low stringency conditions. 

59. An isolated polynucleotide comprising a sequence encoding at least a biologically active
fragment of the sequence set forth in SEQ ID NO: 83, or a variant or derivative thereof.

60. The polynucleotide of claim 59, comprising the sequence set forth in any one of SEQ 
ID NO: 82 and 84, or a biologically active fragment thereof, or a polynucleotide variant of 
these. 

61. The polynucleotide of claim 59, comprising a contiguous sequence of nucleotides at 
least 18 nucleotides in length and contained within the sequence set forth in SEQ ID NO:
86, or variant thereof.

62. The polynucleotide of claim 59, wherein the polynucleotide variant has at least 60% 
sequence identity to any one of the polynucleotides set forth in SEQ ID NO: 82, 84 and 86. 

63. The polynucleotide of claim 59, wherein the polynucleotide variant is capable of
hybridising to any one of the polynucleotides identified by SEQ ID NO: 82, 84 and 86 
under at least low stringency conditions. 

64. The polynucleotide of claim 59, wherein the polynucleotide variant comprises a 
nucleotide sequence encoding at least one PPTase sequence motif selected from SEQ ID
NO: 89 and 93, or variant thereof.

65. The polynucleotide of claim 64, wherein the polynucleotide variant comprises a 
nucleotide sequence encoding the intervening sequence between the said consensus 
PPTase sequence motifs, said nucleotide sequence comprising the sequence set forth in
SEQ ID NO: 91.

66. The polynucleotide of claim 59, wherein the polynucleotide variant suitably comprises 
a nucleotide sequence encoding a contiguous sequence of amino acids contained within the 
sequence set forth in SEQ ID NO: 87, or variant thereof. 

67. The polynucleotide of claim 66, wherein the contiguous sequence is encoded by the 
sequence set forth in SEQ ID NO: 86, or nucleotide sequence variant thereof displaying at
least 60% identity thereto.

68. The polynucleotide of claim 64, wherein the PPTase sequence motif is encoded by a 
nucleotide sequence comprising the sequence set forth in any one of SEQ ID NO: 88 and 
92, or nucleotide sequence variant thereof displaying at least 60% identity thereto.

69. The polynucleotide of claim 65, wherein the intervening sequence is encoded by the
nucleotide sequence set forth in SEQ ID NO: 90, or nucleotide sequence variant thereof
displaying at least 60% identity thereto.

70. The polynucleotide of claim 66, wherein the contiguous sequence is encoded by the 
sequence set forth in SEQ ID NO: 86, or nucleotide sequence variant thereof capable of
hybridising thereto under at least low stringency conditions.

71. The polynucleotide of claim 64, wherein the PPTase sequence motif is encoded by a 
nucleotide sequence comprising the sequence set forth in any one of SEQ ID NO: 88 and 
92, or nucleotide sequence variant thereof capable of hybridising thereto under at least low 
stringency conditions. 

72. The polynucleotide of claim 65, wherein the intervening sequence is encoded by the
nucleotide sequence set forth in SEQ ID NO: 90, or nucleotide sequence variant thereof 
capable of hybridising thereto under at least low stringency conditions. 

73. An isolated polynucleotide comprising a sequence encoding at least a biologically active
fragment of the sequence set forth in SEQ ID NO: 95, or a variant or derivative thereof.

74. The polynucleotide of claim 73, comprising the sequence set forth in any one of SEQ 
ID NO: 94 and 96, or a biologically active fragment thereof, or a polynucleotide variant of 
these. 

75. The polynucleotide of claim 73, comprising a contiguous sequence of nucleotides 
contained within the sequence set forth in SEQ ID NO: 104, or variant thereof.

76. The polynucleotide of claim 73, comprising a contiguous sequence of nucleotides 
contained within the sequence set forth in SEQ ID NO: 106, or variant thereof. 

77. The polynucleotide of claim 73, wherein the polynucleotide variant has at least 60% 
sequence identity to any one of the polynucleotides set forth in SEQ ID NO: 94, 96, 104
and 106.

78. The polynucleotide of claim 73, wherein the polynucleotide variant is capable of 
hybridising to any one of the polynucleotides identified by SEQ ID NO: 94, 96, 104 and 
106 under at least low stringency conditions. 

79. The polynucleotide of claim 73, wherein the polynucleotide variant comprises a 
nucleotide sequence encoding a methyltransferase sequence motif selected from any one or
more of SEQ ID NO: 99, 101 and 103, or variant thereof.

80. The polynucleotide of claim 79, wherein the methyltransferase sequence motif is 
encoded by a nucleotide sequence comprising the sequence set forth in any one of SEQ ID
NO: 98, 100 and 102, or nucleotide sequence variant thereof displaying at least 60%
identity thereto.

81. The polynucleotide of claim 79, wherein the methyltransferase sequence motif is 
encoded by a nucleotide sequence comprising the sequence set forth in any one of SEQ ID
NO: 98, 100 and 102, or nucleotide sequence variant thereof capable of hybridising thereto 
under at least low stringency conditions. 

82. An expression vector comprising the polynucleotide of any one of claims 39, 59 or 73,
wherein the polynucleotide is operably linked to a regulatory polynucleotide. 

83. A host cell containing the expression vector of claim 82. 

84. A multiplicity of cell colonies, constituting a library of colonies, wherein each colony 
of the library contains an expression vector for the production of the polypeptide of claim 1
or claim 12.

85. A method for enhancing the level and/or functional activity of an albicidin, said 
method comprising: 

- introducing into an albicidin-producing host cell (1) an agent that modulates
the expression of a gene encoding at least a portion of the polypeptide of claim 1 or
variant or derivative thereof, or the level and/or functional activity of an expression
product of said gene, or (2) a vector from which a polynucleotide encoding at least a
portion of the polypeptide of claim 1 or variant or derivative thereof can be
translated;

and culturing the host cell for a time and under conditions sufficient to
enhance the level and/or functional activity of said albicidin.

86. The method of claim 85, further comprising introducing into said host cell a vector
from which a PPTase can be translated. 

87. The method of claim 86, wherein the PPTase is selected from EntD or XabA. 

88. The method of claim 85, further comprising introducing into said host cell a vector
from which a methyltransferase can be translated. 

89. The method of claim 86, wherein the methyltransferase is XabC. 

90. An antigen-binding molecule that is immuno-interactive with the polypeptide of claim 
1 or claim 12.

91. An antigen-binding molecule that is immuno-interactive with the polypeptide of claim
23.

92. An antigen-binding molecule that is immuno-interactive with the polypeptide of claim 
31. 

93. A method of preparing a polynucleotide encoding a modified PKS, comprising using a
nucleotide sequence encoding the polypeptide of claim 1 or claim 12 as a scaffold and 
modifying the portions of the nucleotide sequence that encode enzymatic activities, either 
by mutagenesis, inactivation, deletion, insertion, or replacement. 

94. A method for producing polyketides, comprising expressing the modified albicidin
PKS encoding nucleotide sequence produced by the method of claim 93 in a suitable host 
cell to thereby produce a polyketide different from that produced by said polypeptide. 
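
The sequence-identity thresholds recited in the claims above (for example "at least 60%" or "at least 70%" sequence identity) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it counts identical residues over the columns in which neither sequence has a gap. The function names `percent_identity` and `meets_threshold` and this choice of denominator are assumptions made for illustration only; they are not the comparison method defined in the specification. The two example peptides are the XabB ACP2 and ACP3 motif sequences shown in the ACP alignment panel of the figures.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over the non-gap columns of two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    # Keep only columns in which neither sequence carries a gap character.
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(seq_a: str, seq_b: str, threshold: float = 60.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(seq_a, seq_b) >= threshold


# XabB ACP2 and ACP3 motif sequences, as printed in the ACP alignment panel.
acp2 = "EYYGVDSIVAIE"
acp3 = "ESYGVDSIVIIE"
identity = percent_identity(acp2, acp3)
print(f"{identity:.1f}% identity; >= 70%: {meets_threshold(acp2, acp3, 70.0)}")
```

Here 10 of the 12 aligned positions match, so the pair clears both the 60% and the 70% thresholds under this simple counting convention. A real comparison would first need an alignment step, which this sketch deliberately leaves out.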

PlFl

967 GGCGATGCGC GCACACTGCA GGTCCATCAC GCCACCTCCA GCAGGGTGTC 
CCGCTACGCG CGTGTGACGT CCAGGTAGTG CGGTGGAGGT CGTCCCACAG 
AIRACQLDMM RBS 
xatA^ 1 

1017 ATACACGGCC AGCGGATGCT GCAGGTTTTC CACTGGCAGG GCCACTGGCT 
TATGTGCCGG TCGCCTACGA CGTCCAAAAG GTGACCGTCC CGGTGACCGA 

-35 (P„») -10(P^s) 

1067 GTCGTAAGGG AAGCGGTGCC TTGAGCG CCG GTGCGGACAG TATAACGACA
CAGCATTCCC TTCGCCACGG AACTCGCGGC CACGCCT GTC ATATT GCTGT 

-10 (P„*) 

1117 CGTTCCTTGG CCAAGCGCAC TGTCGGCACG GCCTTGCTGA TGCCGCCCAT 
GCAAGGAACC GGTTCGCGTG ACAGCCGTGC CGGAACGACT ACGGCGGGTA
-35 OW 

1167 GTAGCCGCGC GCCTGGATCT CGCGTAGTAG CACCACGCTG GCCGGGATCC 
CATCGGCGCG CGGACCTAGA GCGCATCATC GTGGTGCGAC CGGCCCTAGG 

PIR 

RBS xabB

1217 ATCGAGGGCG CGCTTGCCCA ATGCGCTCAT GCAGATAACT CTTGTAGCCG 
TAGCTCCCGC GCGAACGGGT TACGCGAGTA CGTCTATTGA GAACTACGGC 
MP NAL MQIT LVA 



FIGURE 2 
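
Figure 2 prints each stretch of sequence as two paired lines, the lower line being the base-by-base complement of the upper. A minimal Python sketch of that relationship follows, checked against the first row of the figure (position 967). The helper names `complement` and `reverse_complement` are illustrative assumptions, and only the four unambiguous bases are handled; any other character, including the spaces between ten-base blocks, passes through unchanged.

```python
# Translation table for Watson-Crick pairing (upper and lower case).
_PAIRS = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(strand: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return strand.translate(_PAIRS)


def reverse_complement(strand: str) -> str:
    """Complement read in the opposite direction (the other strand 5' to 3')."""
    return complement(strand)[::-1]


# First row of Figure 2: upper and lower lines as printed.
top = "GGCGATGCGC GCACACTGCA GGTCCATCAC GCCACCTCCA GCAGGGTGTC"
bottom = "CCGCTACGCG CGTGTGACGT CCAGGTAGTG CGGTGGAGGT CGTCCCACAG"

print(complement(top) == bottom)
print(reverse_complement("GGCGATGCGC"))
```

The same check can be applied row by row to spot transcription errors in either line of the figure.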



WO 02/024736 



3/15 



PCT/AU01/01190 



(i). AL

TSGSSGESKGILLSH- -GYFRTGDL Xal-XabB(AL)
TGGTTGVAKGAMLTH- -GWMATGDI Hin-LCFA
TSGSTGTPKAVMLNH- -GWFETGDL Bsu-PksJ
SSGSTGDPKGVMLTH- -GWVKTGDL Bsu-MycA(AL)
SSGTTGLPKGVMLTH- -GWLHTGDI Pcr-ComL2
TSGTTGRPKGVVSAQ- -GWYRTGDL Sma-FkbB(AL)
TSGTTGRPKGVVSTQ- -GWFRTGDL Ame-RifA(AL)
TSGTTGTPKGVLSTQ- -GWYRTGDL Shy-RapA(AL)



(ii). KS

GPSEVINSACSSSLVAL- -VKLHGTGTSL- -ALGHLGAAAG Xal-XabB(KS1)
GPSIAVDTACSASLTAI- -IRAHGTGTVL- -NIGHAESAAG Xal-XabB(KS2)
GPSLFVHTNCSSSLSAL- -VKAHGTGTLL- -NLGHLDTVAG Mxa-Ta1
GPAVTVDTACSSSLVAV- -IEAHGTGTKL- -NIGHLFEAAG Bsu-MycA
GPAVTVDTACSSSLVAL- -VKAHGTGTRL- -NIGHAQAAAG Ser-EryA1
GPAMTVDTACSSGLTAL- -VKAHGTGTRL- -NIGHTQAAAG Ser-EryA3
GPSVLVDTACSGGLTAL- -VECHGTGTQA- -NIGHLEGASG Che-PKS1
GPSLAVDTACSSSLTAI- -LKAHGTGTAL- -NIGHCESAAG Bsu-PksM
GPSVAVDTACSSSLVAI- -VEAHGTGTLL- -NLGHTEAAAG Mtu-PpsA
GPSLTIDTACSSSLMAL- -VKAHGTGTKV- -NMGHPEPASG Chick-FAS
GPSIALDTACSSSLLAL- -IKAHGTGTKV- -NMGHPEPASG Rat-FAS
* * *

(Active site cysteine) (Active site histidine)

(iii). KR 



VYWIGGAGGLGEVLSEHLIRTYD . AQLIWIGR Xal-XabB
VYV I S GGTG ALiARL FV AE I G KRATRATV I L V AR Mxa-Ta1
TVL.VTGGTGGVGGQIARWLARRG - APHI-LLVSR Ser-EryA1
TVl*\rTGGTSGIGAIIIARWI*ARSG . AEHLVIXGR Ser-EryA3
SYLLVGGVGGLGSATALAMSTRG . ARHLLLINR Che-PKS1
SYIITGGLGGLGLFFASKLAAAG . CGRIVLTAR Mtu-MAS
SYIITGGLGGFGLELAQWLIERG . AQKLVLTSR Chick-FAS
SYI I TGGLGGFGLEUAR WLVLRG . AQRLVLTSR Rat-FAS



(iv). ACP 

CELALDSLQCVR Xal-XabB(ACP1)
EYYGVDSIVAIE Xal-XabB(ACP2)
ESYGVDSIVIIE Xal-XabB(ACP3)
IGFGLDSIMLTQ Bsu-MycA
ERYGIDSIIITQ Mxa-Ta1
AELGVDSLSALE Ser-EryA1
QDYGIDSLVAVE Che-PKS1
IEYGLDSLGMLE Mtu-MAS
ADLGLDSLMGVE Chick-FAS
ADLGLDSLMGVE Rat-FAS

* (Active site serine) 
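
The asterisk under the ACP panel marks the one position annotated as the active-site serine. A short Python sketch, offered only as an illustration, shows how fully conserved columns can be picked out of such a gap-free motif alignment. The ten sequences are transcribed from the ACP panel above with internal whitespace removed; the function name `conserved_columns` is a hypothetical helper, not something taken from the specification.

```python
# ACP motif sequences transcribed from the panel above (whitespace removed).
acp_motifs = {
    "Xal-XabB(ACP1)": "CELALDSLQCVR",
    "Xal-XabB(ACP2)": "EYYGVDSIVAIE",
    "Xal-XabB(ACP3)": "ESYGVDSIVIIE",
    "Bsu-MycA": "IGFGLDSIMLTQ",
    "Mxa-Ta1": "ERYGIDSIIITQ",
    "Ser-EryA1": "AELGVDSLSALE",
    "Che-PKS1": "QDYGIDSLVAVE",
    "Mtu-MAS": "IEYGLDSLGMLE",
    "Chick-FAS": "ADLGLDSLMGVE",
    "Rat-FAS": "ADLGLDSLMGVE",
}


def conserved_columns(sequences: list[str]) -> list[int]:
    """Return 0-based indices of columns where every sequence has the same residue."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    return [i for i in range(length) if len({s[i] for s in sequences}) == 1]


columns = conserved_columns(list(acp_motifs.values()))
residues = "".join(acp_motifs["Xal-XabB(ACP1)"][i] for i in columns)
print(columns, residues)
```

In this transcription the only invariant columns are the adjacent aspartate and serine (the "DS" pair), the serine being the residue marked with the asterisk.
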

A. X. albilineans XabB (4801 aa)

N-terminus PKS NRPS
AL ACP1
KS1 KR ACP2 ACP3
KS2

B. B. subtilis MycA (3971 aa)

AL ACP KS ACP

C. Yersinia pestis HMWP1 (3163 aa)

KS AT
KR ACP

D. M. xanthus Ta1 (2392 aa)

KS KR ACP KS

E. B. subtilis PksorfX6 (4447 aa)

KS DH KR ACP
KS KR ACP
KS

1054 TTCCCGCCCGAATAGGCGCAGGAAGCCAATAAGTATGGCAGCGCCCTTG^

1135 GCTCTCCTCCGCGTCCnt:CATCGCC»TTGCGCCCCT 

1216 CGCGACTCTGCGACACTAGCXICAATG TTO^ 

-10 RBS 

M P N A V 5 

1297 ACCGATGCAGGGCGCGCGGGGACTCCCGCAGCCGCAAGC^ 

PMQGARG I* P Q P Q A M M PGL P S V G G LSAG 32 

1378 CCAGCCATTGCAGTTGTCGTTAGCACCGGAACTGCA^ 

QPLQLSLAPBLQAAARSAHRHLLDDGT59 

1459 GGOGCT1TACCTGCTCKMGTTCGATACOGCGCAATTOGA 

ALYI.LAF DTAQ PD PGA P A A M A I A R PDS 86 

GAGGTGCT . 



1540 CATCGCCOGCAGOGT 

IARSVRKRQAKPI»PGRI»AARI*ALQEVIi 113 

XG21 GGGACCTGCGCAAGCGCAGGCAGATATTGC^^ 

GPAQAQADIAIGATRAPCW PAGSLGSI 140 

1702 TTCCCATTGOGAGGACTOCGCGGCCGCCATO^ 

SHCBDYAAAI AMAAGTRHG |V G I q I> B R P 167 



1783 AATCACACCCGCG 



XnTGCTGAGCATCGCAATCGATGCCGACGAAGCCGCTCGTCTGGCA 



ITPAARAALI«SIAIDADEAARLAKAAD 194 
TTnS 

1864 CGCGCAGTGGCOSCAAGACCrreCrarrGACCGCA 

AQWPQDLLLTAL A |K E S) LF@AAYSAV221 

154 5 -OGGACGCTftCrrTGG^^ 

GRYPDFSAARLCGIDLARQCLHLRLTE 248 



V G F A R L 



P D L V L T H 275 



278 



2026 GACACTCTGCGCGCAATTCXrrGGC 

TLCAQPVAGQVC 

2107 CTACGCCTGGTGAGCAOGCGGAC 
YAW* 

21 B8 AAGCTCTCCCCGCAGCCGCACTCGGCGGTGGCA . 
2269 TCGATTTCGGTGCCATCGACCAACTGCAGACTGGGGGCATCGACATAAA^ 
2350 TCCGCGCGTGCCTCGTGCGCOVGATCGGTGACATGGCCC^ 



FIGURE 9 







[Plot not reproduced; x-axis: Culture Age (days)]



FIGURE 10 






Eco-EntD
Sfl-EntD
Sau-EntD
Sty-EntD
Xal-XabA
Mtu-PptT
Bsu-Lpa-14
Bsu-Sfp
Bpu-Psf-1
Bbr-Gsp
Bsu-AcpS
Eco-AcpS
Mtu-AcpS

nin-rnsisz 



FIGURE 11 






E B
N
IF1 <-
NBSB BB
IRI
E N
thp
Tn5 (LS157)
pJP-BAS (LS-JP1)
pJP-BEC (LS-JP2)
pLXABB
pLTHP pLXABC
2 kb (scale bar)



FIGURE 12 






xsOAstopcodon , 

TACCAAAACCCOGCCGCCGTCACCCXTITCATOf^T^ 1TACCTACA I I. 1 uCA 1 i iaCCTTCCAICIH 1 1 1 ACACCACGCTTAAC 105 

;w*C HD5ALPT5APTP0LPYTT 
V N 20 ^ 

GCCTACTAT1X<»CTCCOGCAGTCAAGGCG^ 
CCBCC 225 

AYYRTA AVKAA1BLGLFDVVGQQGRTPAAIABACQASPR 
G 60 

C1»I 

hi rCGCATCCn I U.IAI J A CCTAG TATCGATO GCTTTTCTAOGODGCAA UUU lbt»LL ll» 1 rCTACATACATOCCAACATGGCCATGTA<yTGGATCGTAgI rCCCOGOGCTACC 
TGGGT 345 

I H I I» C Y Y I* V S IGFLRRNCGLPYIDRWKAKYLDRSS PGYL 
G 100 

GGCAGCATCAAGTTCCTGCTCTCGCCCTACATCAT^ 
ACGOG 465 

GS I K P I* L S PY IMSAPTDLTAVVRTCK I H LAQOGVVA P D H 
P 140 

QKglOGGTOGAATITOCACGCGCCftTOCCACOGATC^ tGATOCCCAATATCCTGTOCl I GCCCCC1GATCGCCCG A1 ICCTGTGCTOG ACC1O0CAC 

CCP CC 505 

QWVEPARAMAPKMALPSALIANMVSLPADRPIR pf V p V g 

pi G | 180 

HotiC 

1 

OtCGGCCICTTCQGCItTCGCCTTCGCGCAGaXTTCCGCCAG 
CCCAC 705 

| — g UPC XAPAORPROABVSPLONDNVLDVAR BMAQAAKV 
A 8 220 

(IR) HinelX 
QGA GCCCGl i ieCTOOCagX3UU3CCATrCGACCYCGAT^^ 1 UUU AC CAA H lU-IU-ACtJA l TTC^ TCAGGTCCATGCCGACCGCKrcr 
TGGCT 025 

BARPLPCWAFDLDTG P G Y P V I Z fl TNPLHBPDBVDCBRI 
L A 260 

BCCRI Motif II 

AACACCCGCGATCCCCIt M UtCGACOACGCCATOirCA t' 1 IaGCATGATGA'1 GCTGGCCA 

CCACC 945 ' 

K T R D A IlWDDGMVITI PBPIADBERSSPPLAATPSMKKLC 
T T 300 

Hotit III Clal 
LU^MXU a tfnCCrACACCTATACCC* 
1055 



OGTTC 

PACRSYTY5 DLBRHPRHAG PGHVELKS I PPALLKVVVSR 

if - 3-I&. , . .... 



ATCGAATCGGCGACATOCCCTC 
<A3R> 



RAP 
343 



TOtXnATAOCCIATGfcTCGACATOOGCTOGCACI A 'ITJlljGll. rCCI QOVTGTOCCT CCJtf!TCGCGTIGCCACCACCCC rAT GOQG TKl II ^IUXIXU^ TCOCTCACCCYOCC 
CBAGG 1305 

WO CI 

CTACCGGCAGCATGCCGCCCTCEATGTGCGTI 

           Motif I          Motif II        Motif III
Xal-XabC   174 VLDVAAGHG    236 SGYDVILL    267 ALNDDGMVTT
Sgl-TcmO   173 FVDLGGARG    234 PRADVFIV    263 ALTPGGAVLV
Sgl-TcmN   331 IADLGGGDG    393 TGYDAYLF    423 IGDDDARLLI
Smy-MdmC    64 VLEIGTFTG    135 GAFDIVFV    159 LVRPGGLVAI
Mxa-SafC    63 TLEVGVFTG    134 GTFDLAFI    158 LVRPGGLIIL
Ser-EryG    85 VLDVGFGLG    149 ETFDRVTS    178 VLKPGGVIAI
Spe-DauK   183 VLDVGGGKG    254 RKADAIIL    273 ALEPGGRILI
Sal-DmpM   208 VVDIGGADG    269 GGGDLYVL    298 AMPAHARLLV
Shy-RapM   106 VLKVGCGMG    155 VQGDAEEL    194 ALRRGGALSH
Sav-AveD    71 VLDVGCGSG    124 GSFDAAWA    151 VLRPGGRLAV

SEQUENCE LISTING

<110> The University of Queensland (All designated states, except U.S.) 
Robert, Birch (U.S. only) 

<120> Polynucleotides and polypeptides associated with antibiotic 
biosynthesis and uses therefor 

<130> 2454928/VPA 

<140> Not yet assigned 

<141> 2001-09-21 

<150> AU PR0277/00 

<151> 2000-09-21 

<150> AU PR0304/00 

<151> 2000-09-22 

<150> AU PR0320/00 

<151> 2000-09-22 

<160> 107 

<170> PatentIn version 3.1

<210> 1 

<211> 16511 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1230)..(15632)
<223> 



<220> 

<221> 5'UTR 
<222> (1)..(1229) 
<223> 



<220> 

<221> 3'UTR 

<222> (15633)..(16511)

<223> 
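
The feature coordinates listed for SEQ ID NO: 1 can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only: the dictionary layout and the helper name `span_length` are assumptions, and the listing itself does not say whether the final codon of the CDS is a stop codon, so the codon count is simply the number of triplets in the annotated range.

```python
# Feature ranges as given in the <221>/<222> entries (1-based, inclusive).
features = {
    "5'UTR": (1, 1229),
    "CDS": (1230, 15632),
    "3'UTR": (15633, 16511),
}
declared_length = 16511  # value of the <211> entry


def span_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive coordinate range."""
    return end - start + 1


lengths = {name: span_length(*coords) for name, coords in features.items()}
total = sum(lengths.values())
codons, remainder = divmod(lengths["CDS"], 3)

print(lengths)
print(total == declared_length, codons, remainder)
```

The three ranges tile the full 16511-nucleotide sequence without gaps or overlaps, and the CDS divides evenly into 4801 triplets, which agrees with the 4801 aa length given for XabB in the domain-organisation figure.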



<400> 1 
gaattcatcg ccggccagac gccgtgcgaa agggtcatgg aacagcgcct gctcccgctc 60

gctttccagc gcccgcatgc ctgccaccca taaagcggtt ctttcgatat ctctcatgca 120

tacgctccgg ttcgtggtcg gcttgcgccg atgcatcata gatatgcatg actcgattcg 180

cggcaccgtg ccttgatggt ggctgcgaag cgaaaacaat aaccaaaggg tggtgctcga 240

cggctttact gtagcgacac cttgtccatc gccttacgga tggtctgatc cacgcaagca 300

aaagatgaga taaaccacat cagctgtcaa cgccgattta aatttgaccc actttccttt 360






gaatcgtcga agtaaatctg acccacccgg ggtcttccat cgtcgggctg ctaggctgcg 420 

cagggcaaag cccgtcgcag cccagcagcc ctgcgccggc tcacgcccga agggcaggta 480 

gccgatctcg tcgaccacca gcagcttcgg ccctagtacc gcgcgattga agtagtcctt 540 

cagccggttc tgcgccttga ccgctgccag ttgcatcatc aggtcggccg cggtgatgaa 600 

acgtgccttg tgccccgcca tcaccgcacg ctggcacagc gccagggcga tgtgggtctc 660 

gccgacaccg ctggggccaa gcatcaccac gttctcggcg cgctcgacga aggtcaggtg 720 

gccgagctcg acgatctgcg ccttcgaggc gccgccggcc tgggcccagt cgaactgctc 780 

cagcgtcttg atggacggca tcctggcaag tcgcgtcagc accgtgcgct tgcgctcttc 840 

acgcgcgagc tgttcgcttg ccagcacctt ctccaggaag tagctggcat cctcgcacgc 900 

ggcggcctgt gcgagtgctt gccagtccga gctcaggcgt gccagcttca actgctcgca 960 

cagcgcggcg atgcgcgcac actgcaggtc catcacgcca cctccagcag ggtgtcatac 1020 

acggccagcg gatgctgcag gttttccact ggcagggcca ctggctgtcg taagggaagc 1080 

ggtgccttga gcgccggtgc ggacagtata acgacacgtt ccttggccaa gcgcactgtc 1140 

ggcacggcct tgctgatgcc gcccatgtag ccgcgcgcct ggatctcgcg tagtagcacc 1200 

acgctggccg ggatccatcg agggcgcgc ttg ccc aat gcg ctc atg cag ata 1253
Leu Pro Asn Ala Leu Met Gln Ile
1 5

act ctt gta gcc gtc cag ttt gca ggc gta ttg tta ggc gtc acc gct 1301
Thr Leu Val Ala Val Gln Phe Ala Gly Val Leu Leu Gly Val Thr Ala
10 15 20

cgc gcg gcg atc ccc aat aag gcg ggt atg aga cgc gca tgg ccg ccc 1349
Arg Ala Ala Ile Pro Asn Lys Ala Gly Met Arg Arg Ala Trp Pro Pro
25 30 35 40

ttc ccg cag gcg tgc tgt cgc tct att gct tac ctc atg cag aga tcg 1397
Phe Pro Gln Ala Cys Cys Arg Ser Ile Ala Tyr Leu Met Gln Arg Ser
45 50 55

cca atg tcg ccg tta cag caa acg ctg cta acc cgc ctc gcc agt gcg 1445
Pro Met Ser Pro Leu Gln Gln Thr Leu Leu Thr Arg Leu Ala Ser Ala
60 65 70

gcc gcc tcc cgg aca atg atc gag ttt ccg cgt ccg gag cac gca tcg 1493
Ala Ala Ser Arg Thr Met Ile Glu Phe Pro Arg Pro Glu His Ala Ser
75 80 85

cca caa tgt tgc gac gat gcc gag ctt gcg cga ctg atc gtg cag ttg 1541
Pro Gln Cys Cys Asp Asp Ala Glu Leu Ala Arg Leu Ile Val Gln Leu
90 95 100

tcg gcg gga ctg caa ccg ctg gcg atg ccg ggt acc tac gtg atc att 1589
Ser Ala Gly Leu Gln Pro Leu Ala Met Pro Gly Thr Tyr Val Ile Ile
105 110 115 120

gcc gcg cca cat ggt ggt ttg ttc gcg gca gcc ctg ctt gcc tgt ttg 1637
Ala Ala Pro His Gly Gly Leu Phe Ala Ala Ala Leu Leu Ala Cys Leu
125 130 135


cat gcc aac ctg gtg gcg gtg ccg ttt cca ctg gat gtt gct cag cca 1685
His Ala Asn Leu Val Ala Val Pro Phe Pro Leu Asp Val Ala Gln Pro
140 145 150

aat gag cgg gaa cag gcc agg ctg gag acg atc cac gca caa ttg atg 1733
Asn Glu Arg Glu Gln Ala Arg Leu Glu Thr Ile His Ala Gln Leu Met
155 160 165

gag cat ggc aat gta gcg gtt ctg ctt gac gat gtc gcc gat cgc agt 1781
Glu His Gly Asn Val Ala Val Leu Leu Asp Asp Val Ala Asp Arg Ser
170 175 180

gcc ttc gcg cgc atg gcg cat gct gcg ggc acc ttc ctg gcg acc ttc 1829
Ala Phe Ala Arg Met Ala His Ala Ala Gly Thr Phe Leu Ala Thr Phe
185 190 195 200

gcc gat cta aag cgc gaa tcg acc agc gcc tcc ttg tgc ccg gcg tcg 1877
Ala Asp Leu Lys Arg Glu Ser Thr Ser Ala Ser Leu Cys Pro Ala Ser
205 210 215

cct tcg gac gcc gcc ttg ctg ttg ttt acc tct ggt tcc tcg ggt gag 1925
Pro Ser Asp Ala Ala Leu Leu Leu Phe Thr Ser Gly Ser Ser Gly Glu
220 225 230

tcc aag ggc atc ctg ctt agc cac cgc aac ctg cat cat cag atc cag 1973
Ser Lys Gly Ile Leu Leu Ser His Arg Asn Leu His His Gln Ile Gln
235 240 245

gct ggc atc cgg cag tgg agc ttg gac gag cat agc cat gtg gtg acc 2021
Ala Gly Ile Arg Gln Trp Ser Leu Asp Glu His Ser His Val Val Thr
250 255 260

tgg ctt tct ccc gcg cac aac ttc ggc ctg cat ttc ggc ttg ctg gca 2069
Trp Leu Ser Pro Ala His Asn Phe Gly Leu His Phe Gly Leu Leu Ala
265 270 275 280

ccc tgg ttc agt ggc gcg acg gtc agt ttc atc cat ccg cac agt tat 2117
Pro Trp Phe Ser Gly Ala Thr Val Ser Phe Ile His Pro His Ser Tyr
285 290 295

atg aaa cga ccc ggc ttc tgg ctg gag acg gtt gcg gct aga gac gcc 2165
Met Lys Arg Pro Gly Phe Trp Leu Glu Thr Val Ala Ala Arg Asp Ala
300 305 310

acg cac atg gcc gcg ccg aac ttc gcg ttc gac tac tgc tgc gac tgg 2213
Thr His Met Ala Ala Pro Asn Phe Ala Phe Asp Tyr Cys Cys Asp Trp
315 320 325

gtg atg gtc gag cag ctt ccg ccg tct gcg ttg tct acg ctt acg cat 2261
Val Met Val Glu Gln Leu Pro Pro Ser Ala Leu Ser Thr Leu Thr His
330 335 340

atc gtg tgt ggc ggc gag ccg gtg cgc gcc tcg acc atg cag cgc ttc 2309
Ile Val Cys Gly Gly Glu Pro Val Arg Ala Ser Thr Met Gln Arg Phe
345 350 355 360

ttc gag aaa ttc gcc gga ctc ggt gcg cgt acg cag act ttc atg ccg 2357
Phe Glu Lys Phe Ala Gly Leu Gly Ala Arg Thr Gln Thr Phe Met Pro
365 370 375

cac ttc ggc ttg tct gaa acc ggt gcg ctg agt acc ttg gac gag gcg 2405
His Phe Gly Leu Ser Glu Thr Gly Ala Leu Ser Thr Leu Asp Glu Ala
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380 385 390 

ccc caa cag cgc gtc ttg gaa eta gat gec gac gec ttg aac aaa cgc 2453 
Pro Gin Gin Arg Val Leu Glu Leu Asp Ala Asp Ala Leu Asn Lys Arg 
395 400 405 

aag cgc gtg gcg gca ggg gcg age cag gcg cgt gtg aca gtg etc aat 2501 
Lys Arg Val Ala Ala Gly Ala Ser Gin Ala Arg Val Thr Val Leu Asn 
410 415 420 

tgc ggc gec gtc gac caa gat gtg gag ttg cgt ate gtc tgt cct gaa 2549 
Cys Gly Ala Val Asp Gin Asp Val Glu Leu Arg He Val Cys Pro Glu 
425 430 435 440 

ggc gag acg ttg tgc aga cca gat gag ate ggc gaa ata tgg gta aag 2597 
Gly Glu Thr Leu Cys Arg Pro Asp Glu He Gly Glu He Trp Val Lys 
445 450 455 

teg cct gcg ate gec cgt ggc tac ctg ttt gcg aag ccc gee gat cag 2645 
Ser Pro Ala He Ala Arg Gly Tyr Leu Phe Ala Lys Pro Ala Asp Gin 
460 465 470 

cga cag ttc aac tgc age ate cgt cat ace gac gat age ggt tac ttt 2693 
Arg Gin Phe Asn Cys Ser He Arg His Thr Asp Asp Ser Gly Tyr Phe 
475 480 485 

cgt ace ggc gac ctg ggt ttc att gec gat ggc tgt ctg tat gtc ace 2741 
Arg Thr Gly Asp Leu Gly Phe He Ala Asp Gly Cys Leu Tyr Val Thr 
490 495 500 

gga agg gta aag gag gtg ctg ate ata cgc ggt aag aat cat tac ccc 2789 
Gly Arg Val Lys Glu Val Leu He He Arg Gly Lys Asn His Tyr Pro 
505 510 515 520 

gca cat ate gaa gee teg ate gee get acc gca teg cct ggc gcg ctg 2837 

Ala His He Giu Ale £cr He .Ma Ala Thr £2a Ser Pro Gly Al?. Leu • ••- - ' - - - ■ -** 

525 530 535 

atg ccg gtg gtg ttc age ate gag egg cag gac gag gag cgc gta get 2885 
Met Pro Val Val Phe Ser He Glu Arg Gin Asp Glu Glu Arg Val Ala 
540 545 550 

gcg gtg ate gee gtc aat cac ccg tgg acg ccg gca gca tgc gee gcg 2933 
Ala Val He Ala Val Asn His Pro Trp Thr Pro Ala Ala Cys Ala Ala 
555 560 565 

cag gca cac aag ate egg caa cag gta gee gac cag cat gga gtc gee 2981 
Gin Ala His Lys He Arg Gin Gin Val Ala Asp Gin His Gly Val Ala 
570 575 580 

ctg gcg gag cta gcc ttt gcc gaa cac cgg cac gtg ttc ggc acc tat 3029 
Leu Ala Glu Leu Ala Phe Ala Glu His Arg His Val Phe Gly Thr Tyr 
585 590 595 600 

ccg ggc aaa ctg aag egg cgc eta gtc aag gaa gec tat gtc aac ggc 3077 
Pro Gly Lys Leu Lys Arg Arg Leu Val Lys Glu Ala Tyr Val Asn Gly 
605 610 615 

cag ctg ccg ttg tta tgg cat gag ggt aag aac egg gac gta cca gcg 3125 
Gin Leu Pro Leu Leu Trp His Glu Gly Lys Asn Arg Asp Val Pro Ala 
620 625 630 

gcc gcc gcg gac gat cgg cag gcg caa cac gtg gcg gac ctg tgt cgg 3173 
Ala Ala Ala Asp Asp Arg Gln Ala Gln His Val Ala Asp Leu Cys Arg 
635 640 645 

aag gtc ttt ttg ccg gtg ttg ggt gtc gcg ccg ccg cat gcc caa tgg 3221 
Lys Val Phe Leu Pro Val Leu Gly Val Ala Pro Pro His Ala Gln Trp 
650 655 660 

ccg ctg tgc gaa ctg gcg ctg gat tcg ctc caa tgc gtg cgt ctt gcc 3269 
Pro Leu Cys Glu Leu Ala Leu Asp Ser Leu Gln Cys Val Arg Leu Ala 
665 670 675 680 

ggt gcc atc gaa gag tgc tac ggc gtg cct ttc gaa ccc acg ttg cta 3317 
Gly Ala Ile Glu Glu Cys Tyr Gly Val Pro Phe Glu Pro Thr Leu Leu 
685 690 695 

ttc aag ctt gag acg gtc ggg gca atc gcc gaa tat gtc ctg gcg cac 3365 
Phe Lys Leu Glu Thr Val Gly Ala Ile Ala Glu Tyr Val Leu Ala His 
700 705 710 

gga cgt cag gcg ccc acg ccg acg cgt gcg ccg gtg gca agc aca aca 3413 
Gly Arg Gln Ala Pro Thr Pro Thr Arg Ala Pro Val Ala Ser Thr Thr 
715 720 725 

tgc tca gag gaa ccg atc gcc att gtg gcg atg cac tgt gag gtg ccc 3461 
Cys Ser Glu Glu Pro Ile Ala Ile Val Ala Met His Cys Glu Val Pro 
730 735 740 

gga gcg ggc gag aac act gaa gca ttg tgg tcg ttc ctg cgg agc gac 3509 
Gly Ala Gly Glu Asn Thr Glu Ala Leu Trp Ser Phe Leu Arg Ser Asp 
745 750 755 760 



gtc aac gcg ate egg ccg ate gaa tea acg cgc ccg gac tta tgg gca 3557 
Val Asn Ala He Arg Pro He Glu Ser Thr Arg Pro Asp Leu Trp Ala 
765 770 775 

gcg atg cgc gcc cat ccc ggc ctc gcg ggc gaa cag ctg ccg cgc tat 3605 
Ala Met Arg Ala Tyr Pro Gly Leu Ala Gly Glu Gln Leu Pro Arg Tyr 
780 785 790 

gcg ggt ttc etc gac gac gtt gat get ttc gat get gcg ttt ttc ggt 3653 
Ala Gly Phe Leu Asp Asp Val Asp Ala Phe Asp Ala Ala Phe Phe Gly 
795 800 805 

ate teg cgt cgc gag gcc gaa tgc atg gac ccg cag cag cgc aaa gtg 3701 
He Ser Arg Arg Glu Ala Glu Cys Met Asp Pro Gin Gin Arg Lys Val 
810 815 820 

ctg gag atg gtg tgg aag ctg ate gag caa gcc ggt cac gat ccg ctg 3749 
Leu Glu Met Val Trp Lys Leu He Glu Gin Ala Gly His Asp Pro Leu 
825 830 835 840 

tec tgg ggc ggc cag ccg gtc ggc ctg ttc gtg ggt gcg cat acg tec 3797 
Ser Trp Gly Gly Gin Pro Val Gly Leu Phe Val Gly Ala His Thr Ser 
845 850 855 

gac tat ggc gag ctg ctg gcg age cag ccg caa ctg atg gcc caa tgt 3845 
Asp Tyr Gly Glu Leu Leu Ala Ser Gin Pro Gin Leu Met Ala Gin Cys 
860 865 870 

ggc get tac ate gat teg ggt teg cat ttg acc atg att ccg aac egg 3893 
Gly Ala Tyr He Asp Ser Gly Ser His Leu Thr Met He Pro Asn Arg 
875 880 885 

gct tcg cgc tgg ttc aat ttc acc ggc ccc agc gaa gta atc aac agc 3941 
Ala Ser Arg Trp Phe Asn Phe Thr Gly Pro Ser Glu Val Ile Asn Ser 
890 895 900 

gct tgc tcc agc tcg ctg gtg gcg ctg cat cgg gcg gtt caa tcg ctg 3989 
Ala Cys Ser Ser Ser Leu Val Ala Leu His Arg Ala Val Gln Ser Leu 
905 910 915 920 

cgc caa ggc gaa agc agt gtc gcc ctg gta ctc ggc gtg aac ctt atc 4037 
Arg Gln Gly Glu Ser Ser Val Ala Leu Val Leu Gly Val Asn Leu Ile 
925 930 935 

ctg get ccc aag gtg ctg tta gee agt gca age gcg ggc atg ctt teg 4085 
Leu Ala Pro Lys Val Leu Leu Ala Ser Ala Ser Ala Gly Met Leu Ser 
940 945 950 

ccc gat ggc cgc tgc aag acg ctt gac gee gee gee gat ggc ttc gtg 4133 
Pro Asp Gly Arg Cys Lys Thr Leu Asp Ala Ala Ala Asp Gly Phe Val 
955 960 965 

cgt tcg gaa ggg atc gca ggg gtg ata ttg aag cca ctg gcg cag gcg 4181 
Arg Ser Glu Gly Ile Ala Gly Val Ile Leu Lys Pro Leu Ala Gln Ala 
970 975 980 

ctg gcc gat ggt gac agg gtc tac ggt cta gtc cgc ggc gtg gcg gtc 4229 
Leu Ala Asp Gly Asp Arg Val Tyr Gly Leu Val Arg Gly Val Ala Val 
985 990 995 1000 

aac cat ggc ggc cgt tcc aat tcc ttg cgt gct ccc aac gtc aac 4274 
Asn His Gly Gly Arg Ser Asn Ser Leu Arg Ala Pro Asn Val Asn 
1005 1010 1015 

gcg cag egg caa ctg ctg ate egg act tac cag gaa gee ggt gtc 4319 
Ala Gin Arg Gin Leu Leu He Arg Thr Tyr Gin Glu Ala Gly Val 
1020 1025 1030 

gag ccg gee age gtc ggt tat gtt gaa eta cac ggc act ggt ace 4364 
Glu Pro Ala Ser Val Gly Tyr Val Glu Leu His Gly Thr Gly Thr 
1035 1040 1045 

age ctg ggt gat ccg ate gaa ate cag gcg ctg aag gaa get ttc 4409 
Ser Leu Gly Asp Pro He Glu He Gin Ala Leu Lys Glu Ala Phe 
1050 1055 1060 

att gcg ttg ggg gca cag gee gee ccg tea aac tgc ggc ate ggt 4454 
He Ala Leu Gly Ala Gin Ala Ala Pro Ser Asn Cys Gly He Gly 
1065 1070 1075 

teg gtg aag tec gcg ctg ggc cat eta gaa gee get gca ggc ctg 4499 
Ser Val Lys Ser Ala Leu Gly His Leu Glu Ala Ala Ala Gly Leu 
1080 1085 1090 

acc ggc ctg ate aag gtg ctg ctg atg etc aag cac ggc gag cag 4544 
Thr Gly Leu He Lys Val Leu Leu Met Leu Lys His Gly Glu Gin 
1095 1100 1105 

gee ggc acg cgc cat ttc age acg etc aat ccg ctg ate gat ttg 4589 
Ala Gly Thr Arg His Phe Ser Thr Leu Asn Pro Leu He Asp Leu 
1110 1115 1120 

cga ggt acg tca ttc gaa gtg gtg gcg cag cat cgc gca tgg ccg 4634 
Arg Gly Thr Ser Phe Glu Val Val Ala Gln His Arg Ala Trp Pro 
1125 1130 1135 

tcg cag gtc ggc att cac ggc aca ctc ttg ccg cgt cgc gcg ggt 4679 
Ser Gln Val Gly Ile His Gly Thr Leu Leu Pro Arg Arg Ala Gly 
1140 1145 1150 

ate age tea ttc ggc ttc ggc ggc gee aat gcg cat gcg ate gtg 4724 
lie Ser Ser Phe Gly Phe Gly Gly Ala Asn Ala His Ala lie Val 
1155 1160 1165 

gaa gag cat gtc att gee acg ccc ccc teg acg age tec get ggc 4769 
Glu Glu His Val He Ala Thr Pro Pro Ser Thr Ser Ser Ala Gly 
1170 1175 1180 

ggc ccg gta ggt ate gtg ttg tea gee ggt agt gaa get gtc ttg 4814 
Gly Pro Val Gly He Val Leu Ser Ala Gly Ser Glu Ala Val Leu 
1185 1190 1195 

egg caa caa gtg ctg gee ttg tea gee tgg eta agg cag caa teg 4859 
Arg Gin Gin Val Leu Ala Leu Ser Ala Trp Leu Arg Gin Gin Ser 
1200 1205 1210 

ccg aca ccc gcg caa atg ate gat gtc gee tac acc tta cag gta 4904 
Pro Thr Pro Ala Gin Met He Asp Val Ala Tyr Thr Leu Gin Val 
1215 1220 1225 

gga cgc gca gee ctg teg cac agg ttg get ttt age gcg acg gac 4949 
Gly Arg Ala Ala Leu Ser His Arg Leu Ala Phe Ser Ala Thr Asp 
1230 1235 1240 

gee gag cag gca ttg gcg agg ctt gag ggt cgt ctg gcg ggc gtg 4994 
Ala Glu Gin Ala Leu Ala Arg Leu Glu Gly Arg Leu Ala Gly Val 
1245 1250 1255 

atg gat gee gag gtc cat cac ggt gtc gtg gat get gec gca acg 5039 
Met Asp Ala Glu Val His His Gly Val Val Asp Ala Ala Ala Thr 
1260 1265 1270 

get ccc gaa cat ggg egg cag acg cgc gaa ggt ctt gee ggt ttg 5084 
Ala Pro Glu His Gly Arg Gin Thr Arg Glu Gly Leu Ala Gly Leu 
1275 1280 1285 

ctg cga gee tgg act cag ggc gtg cgc gtc gat tgg teg gcg ctg 5129 
Leu Arg Ala Trp Thr Gin Gly Val Arg Val Asp Trp Ser Ala Leu 
1290 1295 1300 

tac ggc ata cag cga ccg cag cgc gtt age ctg cct gtc tac ccc 5174 
Tyr Gly He Gin Arg Pro Gin Arg Val Ser Leu Pro Val Tyr Pro 
1305 1310 1315 

ttc get agg gaa cgc tat tgg ctg ccc ggc cag get atg cat gee 5219 
Phe Ala Arg Glu Arg Tyr Trp Leu Pro Gly Gin Ala Met His Ala 
1320 1325 1330 

get gcg gac get cat ccg atg ctg cag ctg ttg cat gee aat gec 5264 
Ala Ala Asp Ala His Pro Met Leu Gin Leu Leu His Ala Asn Ala 
1335 1340 1345 

aaa eta cat cgc tac gee ttg cgt agg tec ggc tgc gca age ttt 5309 
Lys Leu His Arg Tyr Ala Leu Arg Arg Ser Gly Cys Ala Ser Phe 
1350 1355 1360 

ctt gtt gat cat tgc gtg gat ggt cga cag gta eta ccg gca gec 5354 
Leu Val Asp His Cys Val Asp Gly Arg Gin Val Leu Pro Ala Ala 
1365 1370 1375 

gtg caa ctg gaa ttg gtg cgc gcc gtg gcg cag egg gtc atg gcg 5399 
Val Gin Leu Glu Leu Val Arg Ala Val Ala Gin Arg Val Met Ala 
1380 1385 1390 

cag gat gag ggt tgt ate gaa ctg gcg cag gtc gcc ttt ttg cat 5444 
Gin Asp Glu Gly Cys He Glu Leu Ala Gin Val Ala Phe Leu His 
1395 1400 1405 

ccc etc atg atg gag gag act gag ctg gag gtc gaa ate gaa ctg 5489 
Pro Leu Met Met Glu Glu Thr Glu Leu Glu Val Glu He Glu Leu 
1410 1415 1420 

teg aag age gat caa gat gag ttc gat ttc caa ctt cac gat get 5534 
Ser Lys Ser Asp Gin Asp Glu Phe Asp Phe Gin Leu His Asp Ala 
1425 1430 1435 

cac cgc caa cag gtc ttt age cag ggg cac gta cgt cgc egg gtc 5579 
His Arg Gin Gin Val Phe Ser Gin Gly His Val Arg Arg Arg Val 
1440 1445 1450 

tat acg gcg aca ccg cgc ttg gat tta gcc cag ctg caa aag ctt 5624 
Tyr Thr Ala Thr Pro Arg Leu Asp Leu Ala Gin Leu Gin Lys Leu 
1455 1460 1465 

tgt gcc gag cgc gtg ttg tec ggc gaa gac tgt tat gcg cac ttc 5669 
Cys Ala Glu Arg Val Leu Ser Gly Glu Asp Cys Tyr Ala His Phe 
1470 1475 1480 

acc gcc tgc gga ttg cag etc ggc gac egg etc aaa tec gtg caa 5714 
Thr Ala Cys Gly Leu Gin Leu Gly Asp Arg Leu Lys Ser Val Gin 
1485 1490 1495 

tcg atc ggc tgc gga cgc aat ggc gag ggc gag ccg atc gca ttg 5759 
Ser Ile Gly Cys Gly Arg Asn Gly Glu Gly Glu Pro Ile Ala Leu 
1500 1505 1510 

ggt gtc ctg cgc ctg cca cca tea age gtt gaa gac age cat gtg 5804 
Gly Val Leu Arg Leu Pro Pro Ser Ser Val Glu Asp Ser His Val 
1515 1520 1525 

ctg cct cct age ctg ctt gat ggt gcc ttg cag tgt age ctt ggc 5849 
Leu Pro Pro Ser Leu Leu Asp Gly Ala Leu Gin Cys Ser Leu Gly 
1530 1535 1540 

ttg cag cgt gat gtc gag cac ate gcc atg cca tac acg ctg gag 5894 
Leu Gin Arg Asp Val Glu His He Ala Met Pro Tyr Thr Leu Glu 
1545 1550 1555 

egg atg acg gtg cat gcg ccg att cct ccc gag gcc tgg gtg ctg 5939 
Arg Met Thr Val His Ala Pro He Pro Pro Glu Ala Trp Val Leu 
1560 1565 1570 

ctg cgt cac ggc cat gca gcc aga cag tec ctg gac ate gat etc 5984 
Leu Arg His Gly His Ala Ala Arg Gin Ser Leu Asp He Asp Leu 
1575 1580 1585 

ctg gat tec gaa ggt agg gtc tgc gtc age etc ggc aat tac acc 6029 
Leu Asp Ser Glu Gly Arg Val Cys Val Ser Leu Gly Asn Tyr Thr 
1590 1595 1600 

ggc cgt gca ccg aaa gcc gtt tcc gcc gtc agg gcg ctt gtc ttg 6074 
Gly Arg Ala Pro Lys Ala Val Ser Ala Val Arg Ala Leu Val Leu 
1605 1610 1615 



gca ccg gtc tgg caa gcg ttg acc gaa acg gcg ccg gca tgg ccc 6119 
Ala Pro Val Trp Gln Ala Leu Thr Glu Thr Ala Pro Ala Trp Pro 
1620 1625 1630 

gat ccg gcc gaa cgc atc gtt acg gta gga gac gat gca tgg cgt 6164 
Asp Pro Ala Glu Arg Ile Val Thr Val Gly Asp Asp Ala Trp Arg 
1635 1640 1645 

agt cac ttc ggt ttc gac gag ccg gcc ttg tcc ctg gag gac agc 6209 
Ser His Phe Gly Phe Asp Glu Pro Ala Leu Ser Leu Glu Asp Ser 
1650 1655 1660 

gtc gaa gtc atc gcg acg cga ctg ggc cag agc ggc aag ttc gat 6254 
Val Glu Val Ile Ala Thr Arg Leu Gly Gln Ser Gly Lys Phe Asp 
1665 1670 1675 

cat cta gtc tgg atc gtg ccg ata gcc gag agt gaa acc gat att 6299 
His Leu Val Trp Ile Val Pro Ile Ala Glu Ser Glu Thr Asp Ile 
1680 1685 1690 

gca gcg caa ggt tca gcg gcg atc gcc ggt ttc cgg ttg gtc aag 6344 
Ala Ala Gln Gly Ser Ala Ala Ile Ala Gly Phe Arg Leu Val Lys 
1695 1700 1705 

gcg ttg ctt gcg ttg ggc tat gcg cat cgc ccg ctg ggt ctc acc 6389 
Ala Leu Leu Ala Leu Gly Tyr Ala His Arg Pro Leu Gly Leu Thr 
1710 1715 1720 

gtg ctg act cgc caa gcc ctt acg cgg cag ccg tcg cac gcg gca 6434 
Val Leu Thr Arg Gln Ala Leu Thr Arg Gln Pro Ser His Ala Ala 
1725 1730 1735 



gtg sac .ggc stc ggg acc; ccg 'jee aag gaa i '^.a Lcjc ■ :tg£ • 
Val His Gly Leu He Gly Thr Leu Ala Lys Glu Tyr Cys Asn Trp 
1740 1745 1750 



6479 



aaa atc cgt ctg ctc gac ctg ccg agc gta aaa tct tgg ccg caa 6524 
Lys Ile Arg Leu Leu Asp Leu Pro Ser Val Lys Ser Trp Pro Gln 
1755 1760 1765 

tgg gag caa ttg cgg tcg ttg cct tgg cat gcg cag ggc gaa gcc 6569 
Trp Glu Gln Leu Arg Ser Leu Pro Trp His Ala Gln Gly Glu Ala 
1770 1775 1780 

ctg atc ggc cgt ggg act tgt tgg tat cgg cgg cag ttg tgt gaa 6614 
Leu Ile Gly Arg Gly Thr Cys Trp Tyr Arg Arg Gln Leu Cys Glu 
1785 1790 1795 

gtg ctg ccg ctg ccg tcg ttg gaa ccg ccg ccg tac cgc gta ggc 6659 
Val Leu Pro Leu Pro Ser Leu Glu Pro Pro Pro Tyr Arg Val Gly 
1800 1805 1810 

ggt gtc tac gtc gtg atc ggc ggc gct ggc ggc ttg ggt gaa gta 6704 
Gly Val Tyr Val Val Ile Gly Gly Ala Gly Gly Leu Gly Glu Val 
1815 1820 1825 

ttg agc gaa cac ttg atc cgc acg tac gac gcg cag ctg atc tgg 6749 
Leu Ser Glu His Leu Ile Arg Thr Tyr Asp Ala Gln Leu Ile Trp 
1830 1835 1840 

atc ggg cgg cgc gtg ctg gac gaa ggc att gcg cgc aag cag acc 6794 
Ile Gly Arg Arg Val Leu Asp Glu Gly Ile Ala Arg Lys Gln Thr 
1845 1850 1855 

egg ctt gcg teg ctg ggc cgc gca ccg cat tac ate tec gcg gac 6839 
Arg Leu Ala Ser Leu Gly Arg Ala Pro His Tyr He Ser Ala Asp 
1860 1865 1870 

gcg agt gac ccg get gee ctg cag gcg gca cat aat gag ate gtt 6884 
Ala Ser Asp Pro Ala Ala Leu Gin Ala Ala His Asn Glu He Val 
1875 1880 1885 

gcg ctg cat ggc cag ccc cat ggg etc ate eta age aac ate gtg 6929 
Ala Leu His Gly Gin Pro His Gly Leu He Leu Ser Asn He Val 
1890 1895 1900 

ctg aag gat gee agt ctg get cgt atg gag gaa gee gat ttc cgt 6974 
Leu Lys Asp Ala Ser Leu Ala Arg Met Glu Glu Ala Asp Phe Arg 
1905 1910 1915 

gac gtg ctg gee gcg aaa etc gac gtc age gtg tgt gcg gca cag 7019 
Asp Val Leu Ala Ala Lys Leu Asp Val Ser Val Cys Ala Ala Gin 
1920 1925 1930 

gtg ttc ggc acg gec ccc ctt gat ttc gtg ctg ttt ttt tct tec 7064 
Val Phe Gly Thr Ala Pro Leu Asp Phe Val Leu Phe Phe Ser Ser 
1935 1940 1945 

ate cag age act acc aag gcg gec ggg caa ggt aac tac gec gee 7109 
He Gin Ser Thr Thr Lys Ala Ala Gly Gin Gly Asn Tyr Ala Ala 
1950 1955 1960 

ggc tgc tgc tat gtc gac get ttc ggc gag eta tgg gcg cgc egg 7154 
Gly Cys Cys Tyr Val Asp Ala Phe Gly Glu Leu Trp Ala Arg Arg 
1965 1970 1975 

ggt ttg agg gta aag acc ate aac tgg ggc tac tgg ggc age gtg 7199 
Gly Leu Arg Val Lys Thr He Asn Trp Gly Tyr Trp Gly Ser Val 
1980 1985 1990 

ggc gtc gta gcg ggc gag gac tat cgc egg cgc atg gcg caa aaa 7244 
Gly Val Val Ala Gly Glu Asp Tyr Arg Arg Arg Met Ala Gin Lys 
1995 2000 2005 

cac atg get teg att gag ggt gec gaa gcg atg cag gtg ttg teg 7289 
His Met Ala Ser He Glu Gly Ala Glu Ala Met Gin Val Leu Ser 

2010 2015 2020 

cag ttg ttg tgt gcg ccg ttg caa egg ctt gee tac gtc aag ate 7334 
Gin Leu Leu Cys Ala Pro Leu Gin Arg Leu Ala Tyr Val Lys He 

2025 2030 2035 

gac gat get aac gca atg cgc get ctg ggc gta gta gag gac gag 7379 
Asp Asp Ala Asn Ala Met Arg Ala Leu Gly Val Val Glu Asp Glu 
2040 2045 2050 

age gtg caa ate cct gtg cac gca ccg gee gag cct ccc aga ggg 7424 
Ser Val Gin He Pro Val His Ala Pro Ala Glu Pro Pro Arg Gly 
2055 2060 2065 

cag cct ggt ccc gtg gtc gag ttg teg gtg aat ctg gat gec egg 7469 
Gin Pro Gly Pro Val Val Glu Leu Ser Val Asn Leu Asp Ala Arg 
2070 2075 2080 

cgc gaa cgg gaa act ttg ctg gcg gcc tgg ctg ctt gag ttg atc 7514 
Arg Glu Arg Glu Thr Leu Leu Ala Ala Trp Leu Leu Glu Leu Ile 
2085 2090 2095 

gag caa ctc ggt ggt ttt ccg ccg gca agt ttc gac atc gct acg 7559 
Glu Gln Leu Gly Gly Phe Pro Pro Ala Ser Phe Asp Ile Ala Thr 
2100 2105 2110 

ctt gcg caa cgc ctg cac atc gta ccc gcc tat cga agc tgg ctg 7604 
Leu Ala Gln Arg Leu His Ile Val Pro Ala Tyr Arg Ser Trp Leu 
2115 2120 2125 

gaa cac agc gtg cgg atg ctc ggc gtg tat ggt tac ctc aga gcg 7649 
Glu His Ser Val Arg Met Leu Gly Val Tyr Gly Tyr Leu Arg Ala 
2130 2135 2140 

acg ggg gaa agc cga ttc gag ctg gcc gac aag ccg ccc gat gat 7694 
Thr Gly Glu Ser Arg Phe Glu Leu Ala Asp Lys Pro Pro Asp Asp 
2145 2150 2155 

gcc agg ggt gcc tgg aac gcg cat gtg cac gag gcc agc gtc gaa 7739 
Ala Arg Gly Ala Trp Asn Ala His Val His Glu Ala Ser Val Glu 
2160 2165 2170 

gcc ggt gaa gag gca cag cgg cgt ctg ctc gat cgc tgc atg cgg 7784 
Ala Gly Glu Glu Ala Gln Arg Arg Leu Leu Asp Arg Cys Met Arg 
2175 2180 2185 

gcg ttg ccg gcg gtc ctt cga ggc gaa cgc aag gcc acc gaa ttg 7829 
Ala Leu Pro Ala Val Leu Arg Gly Glu Arg Lys Ala Thr Glu Leu 
2190 2195 2200 



ctg ttt ccg gaa ggt tcg atg gcg tgg gtc gag ggt atc tac cag 7874 
Leu Phe Pro Glu Gly Ser Met Ala Trp Val Glu Gly Ile Tyr Gln 
2205 2210 2215 

aac aac ccg ctt gcc gat tac ttc aac gca caa cta gtc acg cga 7919 
Asn Asn Pro Leu Ala Asp Tyr Phe Asn Ala Gln Leu Val Thr Arg 
2220 2225 2230 

ctg att gcc tac ttg aga cga cga cta gag tcg acg cct acg gcg 7964 
Leu Ile Ala Tyr Leu Arg Arg Arg Leu Glu Ser Thr Pro Thr Ala 
2235 2240 2245 

cgc ctg aag ctg tgc gag atc ggc gcc ggc agc ggt ggt act act 8009 
Arg Leu Lys Leu Cys Glu Ile Gly Ala Gly Ser Gly Gly Thr Thr 
2250 2255 2260 

gca agc gtg cta caa cag ttg cag gca tat ggt gag cat att gag 8054 
Ala Ser Val Leu Gln Gln Leu Gln Ala Tyr Gly Glu His Ile Glu 
2265 2270 2275 

gaa tat ctc tat acc gac ctg tcg cct gtc ttc ctg cat cat gcg 8099 
Glu Tyr Leu Tyr Thr Asp Leu Ser Pro Val Phe Leu His His Ala 
2280 2285 2290 

gaa aaa cac tat cag cca cga gcg cct tat ttg agg acc gcc tgt 8144 
Glu Lys His Tyr Gln Pro Arg Ala Pro Tyr Leu Arg Thr Ala Cys 
2295 2300 2305 

ttc gac gta gcg cgc gcg ccg acg gcg cag gcc ctg gaa tct ggc 8189 
Phe Asp Val Ala Arg Ala Pro Thr Ala Gln Ala Leu Glu Ser Gly 
2310 2315 2320 

ggc tac gac gtg gtg att gcc gcc aac gta ctg cat gct acg cgc 8234 
Gly Tyr Asp Val Val Ile Ala Ala Asn Val Leu His Ala Thr Arg 
2325 2330 2335 



gat atc gcc aag acc ttg cgc aat gcg aag gca ctc ctc aaa cct 8279 
Asp Ile Ala Lys Thr Leu Arg Asn Ala Lys Ala Leu Leu Lys Pro 
2340 2345 2350 

ggc ggt ctg ctc ttg ctc aac gaa gtg atc gag cgc agc ctc gtc 8324 
Gly Gly Leu Leu Leu Leu Asn Glu Val Ile Glu Arg Ser Leu Val 
2355 2360 2365 

ttg cac ctg act ttc ggt ctg ctg gag agc tgg tgg ttg ccc cag 8369 
Leu His Leu Thr Phe Gly Leu Leu Glu Ser Trp Trp Leu Pro Gln 
2370 2375 2380 

gac aag atc ttg cgc ctt gcc ggc tcg ccg ttg ctg gct tgc gcc 8414 
Asp Lys Ile Leu Arg Leu Ala Gly Ser Pro Leu Leu Ala Cys Ala 
2385 2390 2395 

acc tgg cgc agc ctg ctg gag gct gag ggt ttt gcg ggg ctg agc 8459 
Thr Trp Arg Ser Leu Leu Glu Ala Glu Gly Phe Ala Gly Leu Ser 
2400 2405 2410 

gtg cac agg gcg caa ccc gat gcc ggg cag gcc atc atc tgt gcc 8504 
Val His Arg Ala Gln Pro Asp Ala Gly Gln Ala Ile Ile Cys Ala 
2415 2420 2425 

tac agc gat ggg ata gtg cgg caa gcc agt acg atc gag gtt gcg 8549 
Tyr Ser Asp Gly Ile Val Arg Gln Ala Ser Thr Ile Glu Val Ala 
2430 2435 2440 



cgg aat gaa aaa gta acc gtt ccg tcg cag ccg gcg gaa gcc ggg 8594 
Arg Asn Glu Lys Val Thr Val Pro Ser Gln Pro Ala Glu Ala Gly 
2445 2450 2455 

gaa tcg ccg ctg gat ctg gtc aaa aaa ctg ctt gga cgc att ctg 8639 
Glu Ser Pro Leu Asp Leu Val Lys Lys Leu Leu Gly Arg Ile Leu 
2460 2465 2470 

aaa atg gat ccg gcc aca ctc gat acc agc cac ccg ctg gag tac 8684 
Lys Met Asp Pro Ala Thr Leu Asp Thr Ser His Pro Leu Glu Tyr 
2475 2480 2485 

tac ggt gtc gat tcg atc gtg gcg atc gaa ctg gct atg gca ctg 8729 
Tyr Gly Val Asp Ser Ile Val Ala Ile Glu Leu Ala Met Ala Leu 
2490 2495 2500 

cgc gag aca ttc ccg ggt ttt gaa gtc agc gag ctg ttt gaa acg 8774 
Arg Glu Thr Phe Pro Gly Phe Glu Val Ser Glu Leu Phe Glu Thr 
2505 2510 2515 

caa tcc atc gat acc ttg ttg ggc tct ctt gag cag gct cct ctc 8819 
Gln Ser Ile Asp Thr Leu Leu Gly Ser Leu Glu Gln Ala Pro Leu 
2520 2525 2530 

ctt gct acc ctc aca gct ccg ccg caa caa gac atg ctg cag cag 8864 
Leu Ala Thr Leu Thr Ala Pro Pro Gln Gln Asp Met Leu Gln Gln 
2535 2540 2545 

ctg aaa caa ctg ctg gcg cgt acg ctg aag ctg gac att acg cag 8909 
Leu Lys Gln Leu Leu Ala Arg Thr Leu Lys Leu Asp Ile Thr Gln 
2550 2555 2560 

atc gac acg agc aag acg ctg gag agc tat ggt gtc gac tcc atc 8954 
Ile Asp Thr Ser Lys Thr Leu Glu Ser Tyr Gly Val Asp Ser Ile 
2565 2570 2575 

gtc ate ate gaa tta gec aac gee ttg cgt gag cgc tat ccg age 8999 
Val He He Glu Leu Ala Asn Ala Leu Arg Glu Arg Tyr Pro Ser 
2580 2585 2590 

ttg gac gcg tea cag ctg atg gaa acc tta teg ate gac egg ctg 9044 
Leu Asp Ala Ser Gin Leu Met Glu Thr Leu Ser He Asp Arg Leu 
2595 2600 2605 

gtt gee caa tgg cag gca acg gag ccc gee gta ccg gca gag cca 9089 
Val Ala Gin Trp Gin Ala Thr Glu Pro Ala Val Pro Ala Glu Pro 
2610 2615 2620 

aca gcg gaa ccg ccg gta gee gac gaa gac gee get gec ate ate 9134 
Thr Ala Glu Pro Pro Val Ala Asp Glu Asp Ala Ala Ala He He 
2625 2630 2635 

gga ctg gec ggc cgc ttt cca ggc gcg gac acg ttg gag gag ttc 9179 
Gly Leu Ala Gly Arg Phe Pro Gly Ala Asp Thr Leu Glu Glu Phe 
2640 2645 2650 

tgg aac aac ctg cgc aac ggc caa age agt atg gga gag gtg cca 9224 
Trp Asn Asn Leu Arg Asn Gly Gin Ser Ser Met Gly Glu Val Pro 
2655 2660 2665 

ggc gag cgc tgg gat cac cag cac tac ttc gac agt gaa cgc cag 9269 
Gly Glu Arg Trp Asp His Gin His Tyr Phe Asp Ser Glu Arg Gin 
2670 2675 2680 

gca ccg ggc aag acg tat agc cgc tgg ggt gcg ttt ctg agg gac 9314 
Ala Pro Gly Lys Thr Tyr Ser Arg Trp Gly Ala Phe Leu Arg Asp 
2685 2690 2695 

ata gac ggc ttc gat gca gec ttc ttt gaa tgg ccc gac age gtc 9359 
He Asp Gly Phe Asp Ala Ala Phe Phe Glu Trp Pro Asp Ser Val 

2700 2705 2710 

gcg ctg gaa teg gat ccg caa gcg egg ata ttt eta gag cag gee 9404 
Ala Leu Glu Ser Asp Pro Gin Ala Arg He Phe Leu Glu Gin Ala 

2715 2720 2725 

tat gec ggg ate gaa gat gec ggc tac acg cct ggc teg etc age 9449 
Tyr Ala Gly He Glu Asp Ala Gly Tyr Thr Pro Gly Ser Leu Ser 
2730 2735 2740 

aag agc caa cgc gta ggt gta ttc gta ggt gtg atg aat ggt tac 9494 
Lys Ser Gln Arg Val Gly Val Phe Val Gly Val Met Asn Gly Tyr 
2745 2750 2755 

tac age ggc gga gcg cgc ttc tgg caa ate gec aac cgc gtg teg 9539 
Tyr Ser Gly Gly Ala Arg Phe Trp Gin He Ala Asn Arg Val Ser 
2760 2765 2770 

tac cag ttc gat ttt cgc ggg cca age ctg gcg gtg gat acc gee 9584 
Tyr Gin Phe Asp Phe Arg Gly Pro Ser Leu Ala Val Asp Thr Ala 
2775 2780 2785 

tgt tcg gct tcg ctc acc gcg atc cac ctg gcg ctg gaa agc ctg 9629 
Cys Ser Ala Ser Leu Thr Ala Ile His Leu Ala Leu Glu Ser Leu 
2790 2795 2800 

cgc agc ggc agt tgc gag gtc gca ctg gcc ggt ggc gtg aat ctg 9674 
Arg Ser Gly Ser Cys Glu Val Ala Leu Ala Gly Gly Val Asn Leu 
2805 2810 2815 

ctg gtc gat ccg cag caa tat ctt aat ttg gct ggc gcc gcg atg 9719 
Leu Val Asp Pro Gln Gln Tyr Leu Asn Leu Ala Gly Ala Ala Met 
2820 2825 2830 

ctc tcc gcc ggc gcc agc tgt cgg ccg ttc ggc gag gcc gcg gac 9764 
Leu Ser Ala Gly Ala Ser Cys Arg Pro Phe Gly Glu Ala Ala Asp 
2835 2840 2845 

ggt ttc gtg gcc ggc gaa gcc tgc ggc gtg gtg ctg ctc aag ccg 9809 
Gly Phe Val Ala Gly Glu Ala Cys Gly Val Val Leu Leu Lys Pro 
2850 2855 2860 

ctc aag caa gcg agg gcc gat ggc gat gtg atc cat gcc gta atc 9854 
Leu Lys Gln Ala Arg Ala Asp Gly Asp Val Ile His Ala Val Ile 
2865 2870 2875 

agg ggc agc atg atc aat gcc ggt ggg cac acc agc gcg ttc tcc 9899 
Arg Gly Ser Met Ile Asn Ala Gly Gly His Thr Ser Ala Phe Ser 
2880 2885 2890 

tcg cct aac cct gcc gcc cag gcc gaa gtc gtg cgg cag gcc ttg 9944 
Ser Pro Asn Pro Ala Ala Gln Ala Glu Val Val Arg Gln Ala Leu 
2895 2900 2905 

cag cgc gcg ggc gtg gcg ccc gat tcg atc agc tac atc gag gcg 9989 
Gln Arg Ala Gly Val Ala Pro Asp Ser Ile Ser Tyr Ile Glu Ala 
2910 2915 2920 



cat ggc acc ggc acc gta cta ggc gat gca gtg gag ttg ggt gct 10034 
His Gly Thr Gly Thr Val Leu Gly Asp Ala Val Glu Leu Gly Ala 
2925 2930 2935 

ttg aat aaa gtg ttc gac aag cgc gcg gcg cca tgc ccg atc ggc 10079 
Leu Asn Lys Val Phe Asp Lys Arg Ala Ala Pro Cys Pro Ile Gly 
2940 2945 2950 

tcg ctg aag gcg aac atc ggc cat gcc gaa agc gcc gcg ggc atc 10124 
Ser Leu Lys Ala Asn Ile Gly His Ala Glu Ser Ala Ala Gly Ile 
2955 2960 2965 

gcc ggc ctg gcc aag ctg gta ttg cag ttc agg cat ggc gag ttg 10169 
Ala Gly Leu Ala Lys Leu Val Leu Gln Phe Arg His Gly Glu Leu 
2970 2975 2980 

gtg cct agt ctg aat gcg ttt ccc ttg aat ccc tat att gag ttc 10214 
Val Pro Ser Leu Asn Ala Phe Pro Leu Asn Pro Tyr Ile Glu Phe 
2985 2990 2995 

ggt cgc ttc cag gta caa cag cag ccg gca ccg tgg ccg cgc cgt 10259 
Gly Arg Phe Gln Val Gln Gln Gln Pro Ala Pro Trp Pro Arg Arg 
3000 3005 3010 

ggc gcc cag ccg cgg cgc gcc ggg tta tct gcc ttc ggt gct ggc 10304 
Gly Ala Gln Pro Arg Arg Ala Gly Leu Ser Ala Phe Gly Ala Gly 
3015 3020 3025 

gga tcg aat gcg cac cta gtg gta gag gaa gct ccg gct atg gct 10349 
Gly Ser Asn Ala His Leu Val Val Glu Glu Ala Pro Ala Met Ala 
3030 3035 3040 

ccc ggg gtc tcg atc agc gcc agc tct cca gcc ttg atc gtg ctt 10394 
Pro Gly Val Ser Ile Ser Ala Ser Ser Pro Ala Leu Ile Val Leu 
3045 3050 3055 



tcg gcg cga acg ctg cct gcc ttg caa cag cgt gct cgc gat ctg 10439 
Ser Ala Arg Thr Leu Pro Ala Leu Gln Gln Arg Ala Arg Asp Leu 
3060 3065 3070 

ctc gtc tgg atg caa gcg cgg cag gtg gat gac gtc atg ctg gcc 10484 
Leu Val Trp Met Gln Ala Arg Gln Val Asp Asp Val Met Leu Ala 
3075 3080 3085 

gac gtt gct tat acg ctg cac ttg ggc cgc gtc gcg atg gag caa 10529 
Asp Val Ala Tyr Thr Leu His Leu Gly Arg Val Ala Met Glu Gln 
3090 3095 3100 

cgc ctg gct ttt acc gct ggc tcg gct gcc gag ttg agc gag aaa 10574 
Arg Leu Ala Phe Thr Ala Gly Ser Ala Ala Glu Leu Ser Glu Lys 
3105 3110 3115 

tta cag gct tac ctg ggc cat gcg att cgg gcc gac atc tat ctg 10619 
Leu Gln Ala Tyr Leu Gly His Ala Ile Arg Ala Asp Ile Tyr Leu 
3120 3125 3130 

agc gag gac acg ccc ggc aaa ccg gca ggc gct ccg atc gtg gcc 10664 
Ser Glu Asp Thr Pro Gly Lys Pro Ala Gly Ala Pro Ile Val Ala 
3135 3140 3145 

gag gaa gat ctg ctc acg ctg atg gat gcc tgg atc gaa aag ggc 10709 
Glu Glu Asp Leu Leu Thr Leu Met Asp Ala Trp Ile Glu Lys Gly 
3150 3155 3160 



cag t^- ggt cgt ttg ctg gag tac tgg arc aag ggc caa ccg atc 10754 
Gln Tyr Gly Arg Leu Leu Glu Tyr Trp Thr Lys Gly Gln Pro Ile 
3165 3170 3175 



gac tgg aac aaa ctc tat tgg cgc aag ctg tat gcg gac gga cgg 10799 
Asp Trp Asn Lys Leu Tyr Trp Arg Lys Leu Tyr Ala Asp Gly Arg 
3180 3185 3190 

ccg cgg cgg atc agc ctg ccc acc tat ccg ttc gag cac cgg cgt 10844 
Pro Arg Arg Ile Ser Leu Pro Thr Tyr Pro Phe Glu His Arg Arg 
3195 3200 3205 

tat tgg caa acg ccg gtg ccg ggc gag cga agc ctg cac gcc acc 10889 
Tyr Trp Gln Thr Pro Val Pro Gly Glu Arg Ser Leu His Ala Thr 
3210 3215 3220 

gcg cca gct act cgg gaa acg gtt gcg gtt ggt gcc atg ccg gat 10934 
Ala Pro Ala Thr Arg Glu Thr Val Ala Val Gly Ala Met Pro Asp 
3225 3230 3235 

ccg gcc ggc gct acg gtg caa gcc cgg ttg tgc gcc ttg tgc caa 10979 
Pro Ala Gly Ala Thr Val Gln Ala Arg Leu Cys Ala Leu Cys Gln 
3240 3245 3250 

gtg ttg ttg ggc aaa ccg gtc acg gcc cag atg gat ttc ttt gcc 11024 
Val Leu Leu Gly Lys Pro Val Thr Ala Gln Met Asp Phe Phe Ala 
3255 3260 3265 

gtc ggc ggc cat tcg gtg ctg gcg atc caa ttg gtc tcg cgc atc 11069 
Val Gly Gly His Ser Val Leu Ala Ile Gln Leu Val Ser Arg Ile 
3270 3275 3280 

cgc aaa agc ttc ggg gtg gag tat ccg gtc agc gct ttg ttc gaa 11114 
Arg Lys Ser Phe Gly Val Glu Tyr Pro Val Ser Ala Leu Phe Glu 
3285 3290 3295 

tcg gcg ctg ttg tcg gac atg gcg cgg cag atc gaa caa ttg cgg 11159 
Ser Ala Leu Leu Ser Asp Met Ala Arg Gln Ile Glu Gln Leu Arg 
3300 3305 3310 

gtg aac gga gtc gcc aag cgc atg ccg gcg ttg ttg cct gcc ggg 11204 
Val Asn Gly Val Ala Lys Arg Met Pro Ala Leu Leu Pro Ala Gly 
3315 3320 3325 

cgc gtg ggc gcg att cct gcg act tat gca cag gag cgc cta tgg 11249 
Arg Val Gly Ala Ile Pro Ala Thr Tyr Ala Gln Glu Arg Leu Trp 
3330 3335 3340 

ctc gtc cac gaa cat atg agt gag caa cgc agt agt tac aac atc 11294 
Leu Val His Glu His Met Ser Glu Gln Arg Ser Ser Tyr Asn Ile 
3345 3350 3355 

acc ttt gcc atg cac ttc aga ggc gtc gac ttc cgt gct gaa gcg 11339 
Thr Phe Ala Met His Phe Arg Gly Val Asp Phe Arg Ala Glu Ala 
3360 3365 3370 

atg cgt gcc gca ttg aac gcg ctg gtg gtg cgg cac gaa gtg ctg 11384 
Met Arg Ala Ala Leu Asn Ala Leu Val Val Arg His Glu Val Leu 
3375 3380 3385 

cgc aca cgc ttt ctt tcg gag gac ggg cag ctg caa cag gtg atc 11429 
Arg Thr Arg Phe Leu Ser Glu Asp Gly Gln Leu Gln Gln Val Ile 
3390 3395 3400 

gct gcc tcg ttg acg ttg gag gtg ccg gta agi?. gag atg tcg gtc 11474 
Ala Ala Ser Leu Thr Leu Glu Val Pro Val Arg Glu Met Ser Val 
3405 3410 3415 

gag gag gtc gac ctg ctg ctg gcc gcg agc acg cgg gag act ttc 11519 
Glu Glu Val Asp Leu Leu Leu Ala Ala Ser Thr Arg Glu Thr Phe 
3420 3425 3430 

gat ctg cgg cag ggg ccc ttg ttc aag gca cgc atc ctg cgc gtg 11564 
Asp Leu Arg Gln Gly Pro Leu Phe Lys Ala Arg Ile Leu Arg Val 
3435 3440 3445 

gcg gcc gat cac cat gtg gtg ttg agc agc atc cac cac atc att 11609 
Ala Ala Asp His His Val Val Leu Ser Ser Ile His His Ile Ile 
3450 3455 3460 

tcc gac ggc tgg tcg ctg gga gtg ttc aac cgt gac ctg cac cag 11654 
Ser Asp Gly Trp Ser Leu Gly Val Phe Asn Arg Asp Leu His Gln 
3465 3470 3475 

ctg tac gag gcg tgt ttg cgc ggc acg ccc ccc aca ctg ccg acg 11699 
Leu Tyr Glu Ala Cys Leu Arg Gly Thr Pro Pro Thr Leu Pro Thr 
3480 3485 3490 

ctg gcg gtg cag tat gcc gac tac gcg ctg tgg caa cgg caa tgg 11744 
Leu Ala Val Gln Tyr Ala Asp Tyr Ala Leu Trp Gln Arg Gln Trp 
3495 3500 3505 

gag ctg gcg gct ccg ctg tcg tac tgg acg cgg gca ctg gaa ggc 11789 
Glu Leu Ala Ala Pro Leu Ser Tyr Trp Thr Arg Ala Leu Glu Gly 
3510 3515 3520 

tac gac gac ggc ctg gac ttg ccc tac gac cgg ccg cgc ggc gcc 11834 
Tyr Asp Asp Gly Leu Asp Leu Pro Tyr Asp Arg Pro Arg Gly Ala 
3525 3530 3535 

acg cgg gcg tgg cgg gca ggg ctg gtc aaa cac cgc tat ccg ccg 11879 
Thr Arg Ala Trp Arg Ala Gly Leu Val Lys His Arg Tyr Pro Pro 
3540 3545 3550 

caa ctg gcc cag cag ttg gcg gcc tac agc caa cag tac caa gcg 11924 
Gln Leu Ala Gln Gln Leu Ala Ala Tyr Ser Gln Gln Tyr Gln Ala 
3555 3560 3565 

acg ctg ttc atg agc ctg ctg gca ggc ctg gcg ttg gtg ctg ggc 11969 
Thr Leu Phe Met Ser Leu Leu Ala Gly Leu Ala Leu Val Leu Gly 
3570 3575 3580 

cgt tac gcc gat cgc aag gac gtg tgc atc ggc gcg acg gtc tcc 12014 
Arg Tyr Ala Asp Arg Lys Asp Val Cys Ile Gly Ala Thr Val Ser 
3585 3590 3595 

ggc cgc gac cag ctg gag ctg gaa gag ctg atc ggc ttt ttc atc 12059 
Gly Arg Asp Gln Leu Glu Leu Glu Glu Leu Ile Gly Phe Phe Ile 
3600 3605 3610 

aat att ttg ccg ctg cgg gtg gac ctg tcg ggg gat ccg tgc ctg 12104 
Asn Ile Leu Pro Leu Arg Val Asp Leu Ser Gly Asp Pro Cys Leu 
3615 3620 3625 

gag gag gtg ctg ctg cgc acg cgt caa gtg gta ctg gat ggc ttc 12149 
Glu Glu Val Leu Leu Arg Thr Arg Gln Val Val Leu Asp Gly Phe 
3630 3635 3640 

gcg cac cag tcg gtg ccg ttc gag cac gtg ttg cag gcg ctg cgg 12194 
Ala His Gln Ser Val Pro Phe Glu His Val Leu Gln Ala Leu Arg 
3645 3650 3655 



cgt cag cgc gac agt age cag ate ccg ctg gtg ccg gtg atg ctg 
Arg Gin Arg Asp Ser Ser Gin He Pro Leu Val Pro Val Met Leu 
3660 3665 3670 



12239 



cga cac cag aac ttc ccg acg cag gag att ggc gat tgg ccc gag 
Arg His Gin Asn Phe Pro Thr Gin Glu He Gly Asp Trp Pro Glu 
3675 3680 3685 



12284 



gga gtg egg ctg acg cag atg gag ctg ggg ctg gac cgt age acg 
Gly Val Arg Leu Thr Gin Met Glu Leu Gly Leu Asp Arg Ser Thr 
3690 3695 3700 



12329 



ccg age gag ctg gat tgg cag ttc tac ggc gac ggc age teg ctg 
Pro Ser Glu Leu Asp Trp Gin Phe Tyr Gly Asp Gly Ser Ser Leu 
3705 3710 3715 



12374 



gag ctg acg ctg gaa tac gcg cag gac etc 
Glu Leu Thr Leu Glu Tyr Ala Gin Asp Leu 
3720 3725 



ttc gac gaa gcg acg 
Phe Asp Glu Ala Thr 
3730 



12419 



gtg egg egg atg ate 
Val Arg Arg Met He 
3735 



gca cac cac cag cag gcg ttg gag gcg atg 
Ala His His Gin Gin Ala Leu Glu Ala Met 
3740 3745 



12464 
gtg agc cgg cca cag 
Val Ser Arg Pro Gln 
3750 



ctg cgg gtg ggc aag 
Leu Arg Val Gly Lys 
3755 



tgg gac atg ctg acg 
Trp Asp Met Leu Thr 
3760 



12509 



gee gaa gag cgc egg 
Ala Glu Glu Arg Arg 
3765 



ctg ttt gcc gcg cta aat gcg aca ggt acg 
Leu Phe Ala Ala Leu Asn Ala Thr Gly Thr 
3770 3775 



12554 



cca egg gag tgg ccc 
Pro Arg Glu Trp Pro 
3780 



agt ctg gcg cag cag 
Ser Leu Ala Gin Gin 
3785 



ttc gaa egg cag gcg 
Phe Glu Arg Gin Ala 
3790 



12599 



cag gcg acg ccg cag 
Gin Ala Thr Pro Gin 
3795 



gee ata gca tgc gtg age gat ggg cag teg 
Ala He Ala Cys Val Ser Asp Gly Gin Ser 
3800 3805 



12644 



tgg age tat gcg cag ttg gag gcg cgc gee 
Trp Ser Tyr Ala Gin Leu Glu Ala Arg Ala 
3810 3815 



aac cag ctg gca cag 
Asn Gin Leu Ala Gin 
3820 



12689 



gcg ctg cgt ggg cag 
Ala Leu Arg Gly Gin 
3825 



ggc gcg ggc egg gac gtg egg gtg gcg gta 
Gly Ala Gly Arg Asp Val Arg Val Ala Val 
3830 3835 



12734 



cag agt gcg cgc acg 
Gin Ser Ala Arg Thr 
3840 



ccg gaa ctg ctg atg gcc ttg ctg gcg atc 
Pro Glu Leu Leu Met Ala Leu Leu Ala Ile 
3845 3850 



12779 



ttc aag gec ggt gca 
Phe Lys Ala Gly Ala 
3855 



tgc tat gtg ccg ate gat ccg gec tac ccg 
Cys Tyr Val Pro He Asp Pro Ala Tyr Pro 
3860 3865 



12824 



gcg gcc tac cgc gag cag atc ctg gcc gag gtg cag gtg tcg atc 
Ala Ala Tyr Arg Glu Gln Ile Leu Ala Glu Val Gln Val Ser Ile 
3870 3875 3880 



12869 



gtg ctg gag caa gac 
Val Leu Glu Gin Asp 
3885 



gag ctg gcg ctg gac gag caa ggg cag ttc 
Glu Leu Ala Leu Asp Glu Gin Gly Gin Phe 
3890 3895 



12914 



cac aat ccg cgt tgg cgc gag caa gee ccg acg ccg ctg ggg ctg 
His Asn Pro Arg Trp Arg Glu Gin Ala Pro Thr Pro Leu Gly Leu 
3900 3905 3910 



12959 



agg gaa cat ccg ggc gac ctg gcg tgc gtg atg gtg acc tcc ggc 
Arg Glu His Pro Gly Asp Leu Ala Cys Val Met Val Thr Ser Gly 
3915 3920 3925 



13004 



teg acc ggc egg ccc 
Ser Thr Gly Arg Pro 
3930 



aag ggc gtg atg gtg 
Lys Gly Val Met Val 
3935 



ccg tat gcg cag ctg 
Pro Tyr Ala Gin Leu 
3940 



13049 



cac aac tgg ctg cat 
His Asn Trp Leu His 
3945 



gca ggc tgg cag cgt tct gcg ttc gag gee 
Ala Gly Trp Gin Arg Ser Ala Phe Glu Ala 
3950 3955 



13094 



ggg gag egg gtg ctg cag aag acc teg ate gec ttt gcg gtg teg 
Gly Glu Arg Val Leu Gin Lys Thr Ser He Ala Phe Ala Val Ser 
3960 3965 3970 



13139 



gta aag gag ttg cta agc ggg ctg ctg gcg ggg gtg gaa cag gtg 
Val Lys Glu Leu Leu Ser Gly Leu Leu Ala Gly Val Glu Gln Val 
3975 3980 3985 



13184 



atg ctg ccg gac gag cag gtg aag gac age 
Met Leu Pro Asp Glu Gin Val Lys Asp Ser 
3990 3995 



ctg gcg ttg gcg egg 
Leu Ala Leu Ala Arg 
4000 



13229 



gcg att gag caa tgg 
Ala Ile Glu Gln Trp 
4005 



cag gtg acg egg ctg tac eta gtg cca teg 
Gin Val Thr Arg Leu Tyr Leu Val Pro Ser 
4010 4015 



13274 



cac ctg cag gcg ctg ctg gac gcg acg caa 
His Leu Gin Ala Leu Leu Asp Ala Thr Gin 
4020 4025 



gga cga gac ggg eta 
Gly Arg Asp Gly Leu 
4030 



13319 



ctg cac teg ctg cgt 
Leu His Ser Leu Arg 
4035 



cac gtg gtg acg gcg ggg gaa gcg ttg ccg 
His Val Val Thr Ala Gly Glu Ala Leu Pro 
4040 4045 



13364 



tct gcg gtg cgc gaa acg gtg egg gtg cgt 
Ser Ala Val Arg Glu Thr Val Arg Val Arg 
4050 4055 



ctg cca cag gtg cag 
Leu Pro Gin Val Gin 
4060 



13409 



eta tgg aac aac tat ggc tgc acg gaa ctg aac gac gcg ace tac 
Leu Trp Asn Asn Tyr Gly Cys Thr Glu Leu Asn Asp Ala Thr Tyr 
4065 4070 4075 



13454 



cat cgg tcg gat acg gtg gcg cca gga acg ttt gtg ccg atc ggc 
His Arg Ser Asp Thr Val Ala Pro Gly Thr Phe Val Pro Ile Gly 
4080 4085 4090 



13499 



gca ccg atc gcc aac acc gag gta tac gtg ctg gac cgg cag ctg 
Ala Pro Ile Ala Asn Thr Glu Val Tyr Val Leu Asp Arg Gln Leu 
4095 4100 4105 



13544 



cgg cag gtg ccg atc 
Arg Gln Val Pro Ile 
4110 



ggg gtg atg ggc gag 
Gly Val Met Gly Glu 
4115 



ctg cac gta cac agc 
Leu His Val His Ser 
4120 



13589 



gtg ggg atg gcg cgc 
Val Gly Met Ala Arg 
4125 



ggc tac tgg aac egg ccg ggg ctg acg gec 
Gly Tyr Trp Asn Arg Pro Gly Leu Thr Ala 
4130 4135 



13634 



teg cgc ttc ate gcg cac ccg tat age gag 
Ser Arg Phe He Ala His Pro Tyr Ser Glu 
4140 4145 



gag ccg ggc aca egg 
Glu Pro Gly Thr Arg 
4150 



13679 



ctg tac aag acc ggt 
Leu Tyr Lys Thr Gly 
4155 



gac atg gta cgc egg ctg gcg gac ggg acg 
Asp Met Val Arg Arg Leu Ala Asp Gly Thr 
4160 4165 



13724 



ctg gaa tac ctg ggc 
Leu Glu Tyr Leu Gly 

4170 



cga cag gac ttc gag 
Arg Gin Asp Phe Glu 
4175 



gtc aag gtg cgc ggc 
Val Lys Val Arg Gly 
4180 



13769 



cac egg gtg gat acg 
His Arg Val Asp Thr 
4185 



egg cag gtg gag gcg 
Arg Gin Val Glu Ala 
4190 



gee ttg egg gcg cag 
Ala Leu Arg Ala Gin 
4195 



13814 



ccc gcg gtg gec gag gcg gtg gtg age ggt 
Pro Ala Val Ala Glu Ala Val Val Ser Gly 
4200 4205 



cac egg gtg gac ggg 
His Arg Val Asp Gly 

4210 



13859 



gac atg cag ttg gtg gcc tat gtg gtg gcg cgt gaa ggg cag gca 
Asp Met Gln Leu Val Ala Tyr Val Val Ala Arg Glu Gly Gln Ala 
4215 4220 4225 



13904 



ccg age gcg ggc gag ttg aaa caa cag ctg 
Pro Ser Ala Gly Glu Leu Lys Gin Gin Leu 
4230 4235 



teg gcg cag ttg ccg 
Ser Ala Gin Leu Pro 
4240 



13949 



ace tac atg ctg ccg 
Thr Tyr Met Leu Pro 
4245 



acc gtg tac cag tgg 
Thr Val Tyr Gin Trp 
4250 



ctg gag cag ttg ccg 
Leu Glu Gin Leu Pro 
4255 



13994 



egg ctg tec aac ggc aag ttg gac egg ttg 
Arg Leu Ser Asn Gly Lys Leu Asp Arg Leu 
4260 4265 



gcg ctg ccg gcg ccg 
Ala Leu Pro Ala Pro 
4270 



14039 



cag gtg gta cac gcg 
Gln Val Val His Ala 
4275 



cag gag tac gtc gcg cca cgc aac gag gec 
Gin Glu Tyr Val Ala Pro Arg Asn Glu Ala 
4280 4285 



14084 



gag caa cgg ctg gcg gca ctg ttt gcc gag gtg ctg cgg gtg gag 14129 
Glu Gln Arg Leu Ala Ala Leu Phe Ala Glu Val Leu Arg Val Glu 
4290 4295 4300 

cag gtg ggc atc cac gac aac ttc etc gcc ttg ggt ggg cac tcg 14174 
Gln Val Gly Ile His Asp Asn Phe Phe Ala Leu Gly Gly His Ser 
4305 4310 4315 



ctg tct gca teg caa ctg ate teg cgc ate cgc caa agt ttt cac 
Leu Ser Ala Ser Gin Leu He Ser Arg He Arg Gin Ser Phe His 
4320 4325 4330 



14219 



gtc gat ctg ccg ctg 
Val Asp Leu Pro Leu 
4335 



age egg ate ttc gag gca ccc acg ate gag 
Ser Arg He Phe Glu Ala Pro Thr He Glu 
4340 4345 



14264 



ggc ctg gtc agg cag cta gcg ttg cct agt 
Gly Leu Val Arg Gln Leu Ala Leu Pro Ser 
4350 4355 



gaa ggc ggc gtg gcc 
Glu Gly Gly Val Ala 
4360 



14309 



age ate gee agg gta 
Ser He Ala Arg Val 
4365 



gcg cga aac egg acg 
Ala Arg Asn Arg Thr 
4370 



ate cca ttg teg ctg 
He Pro Leu Ser Leu 
4375 



ttc cag gaa cgc ctg tgg ttc gtg cac caa cac atg cct gag caa 
Phe Gin Glu Arg Leu Trp Phe Val His Gin His Met Pro Glu Gin 
4380 4385 4390 



14354 



14399 



cgc acc agt tac aac ggc acg etc gee ttg cgt ttg cgt ggt cct 
Arg Thr Ser Tyr Asn Gly Thr Leu Ala Leu Arg Leu Arg Gly Pro 
4395 4400 4405 



14444 



ttg teg gtg gaa gcg 
Leu Ser Val Glu Ala 
4410 



atg cgt gca gcg ctg 
Met Arg Ala Ala Leu 
4415 



cgt gcg tta gtg ctg 
Arg Ala Leu Val Leu 
4420 



14489 



cgc cac gaa ate ttg 
Arg His Glu He Leu 
4425 



cgt acc cgc ttc gtg ttg ccg acc ggt get 
Arg Thr Arg Phe Val Leu Pro Thr Gly Ala 
4430 4435 



14534 



age gag ccg gtg cag 
Ser Glu Pro Val Gin 
4440 



gtc att gac gag cac 
Val He Asp Glu His 
4445 



age gat ttc cag etc 
Ser Asp Phe Gin Leu 
4450 



tca gta cag cta gtc gag gat act gag atc gcg tcg ctg atg gat 
Ser Val Gln Leu Val Glu Asp Thr Glu Ile Ala Ser Leu Met Asp 
4455 4460 4465 



gaa ctg gca agt cat ate tac gac tta gec 
Glu Leu Ala Ser His He Tyr Asp Leu Ala 
4470 4475 



aac ggc ccg ctg ttc 
Asn Gly Pro Leu Phe 
4480 



att gca tgc ctt ttg caa ctg gat gag caa gaa cat gtg ctg eta 
He Ala Cys Leu Leu Gin Leu Asp Glu Gin Glu His Val Leu Leu 
4485 4490 4495 



ate ggc atg cat cac ctt ate tac gac get 
He Gly Met His His Leu He Tyr Asp Ala 
4500 4505 



tgg teg caa ttc acc 
Trp Ser Gin Phe Thr 
4510 



gtg atg aac cgc gat 
Val Met Asn Arg Asp 
4515 



eta cgc gtg ctg tat cac cgc cac etc gga 
Leu Arg Val Leu Tyr His Arg His Leu Gly 
4520 4525 



ctt gcc ggc gga gat ctg ccg gaa tta ccg 
Leu Ala Gly Gly Asp Leu Pro Glu Leu Pro 
4530 4535 



atc caa tat gcc gac 
Ile Gln Tyr Ala Asp 
4540 



tat gcg atc tgg caa 
Tyr Ala Ile Trp Gln 
4545 



cgc gcc cag aac ctg 
Arg Ala Gln Asn Leu 
4550 



gac gcg caa ctg gcc 
Asp Ala Gln Leu Ala 
4555 



tat tgg cag gct atg ttg cac gac tac gac 
Tyr Trp Gln Ala Met Leu His Asp Tyr Asp 
4560 4565 



gac ggc ctg gag ctg 
Asp Gly Leu Glu Leu 
4570 



ccc tac gac tat ccg 
Pro Tyr Asp Tyr Pro 
4575 



cgt ccg cgc aat cgc acc tgg cac gca gcg 
Arg Pro Arg Asn Arg Thr Trp His Ala Ala 
4580 4585 



gtc-tcc aca coc acc tut ccg get gaa ctg gta cag cgc ttt gec 
Val Tyr Thr His Thr Tyr Pro Ala Glu Leu Val Gin Arg Phe Ala 
4590 4595 4600 



14669 



14714 



14759 



14804 



14849 



14894 



14939 



14984 



ggc ttc gta cag gcg 
Gly Phe Val Gln Ala 
4605 



cat cag tcg acc ttg 
His Gln Ser Thr Leu 
4610 



ttc atc ggg ctg ttg 
Phe Ile Gly Leu Leu 
4615 



gcc agc ttc gcg gtc gtg ttg aac aaa tac 
Ala Ser Phe Ala Val Val Leu Asn Lys Tyr 
4620 4625 



acc ggc cgg gac gac 
Thr Gly Arg Asp Asp 
4630 



ttg tgc atc ggt acc 
Leu Cys Ile Gly Thr 
4635 



acc acg gca ggg cgc 
Thr Thr Ala Gly Arg 
4640 



acg cac ctg gag ctg 
Thr His Leu Glu Leu 
4645 



gag aac ctg atc ggt ttc ttc atc aac atc 
Glu Asn Leu Ile Gly Phe Phe Ile Asn Ile 
4650 4655 



ttg cct ttg cgc ttg 
Leu Pro Leu Arg Leu 
4660 



cgc ttg gac ggc gat 
Arg Leu Asp Gly Asp 
4665 



ccg gac gtt gee gaa ate atg egg cga aca 
Pro Asp Val Ala Glu He Met Arg Arg Thr 
4670 4675 



egg ttg gtg gcg atg age gcg ttt gag aac cag gcg eta ccg ttc 
Arg Leu Val Ala Met Ser Ala Phe Glu Asn Gin Ala Leu Pro Phe 
4680 4685 4690 



15074 



15119 



15164 



15209 



15254 



15299 
gag cac ctg ctc aac gcc ctg cac aag caa cgt gac acc agc cgg 15344 
Glu His Leu Leu Asn Ala Leu His Lys Gln Arg Asp Thr Ser Arg 
4695 4700 4705 

att ccg cta gtt ccg gtg gtg atg cgt cat cag aac ttc ccg gac 15389 
Ile Pro Leu Val Pro Val Val Met Arg His Gln Asn Phe Pro Asp 
4710 4715 4720 

acg atc ggc gac tgg agc gat ggc atc cgt acc gaa gtg atc cag 15434 
Thr Ile Gly Asp Trp Ser Asp Gly Ile Arg Thr Glu Val Ile Gln 
4725 4730 4735 

cgc gat ctg cgt gcc acc ccc aat gaa atg gac ctg caa ttc ttc 15479 
Arg Asp Leu Arg Ala Thr Pro Asn Glu Met Asp Leu Gln Phe Phe 
4740 4745 4750 

ggc gac ggt acg ggg ctt tcg gtc aca gtg gaa tac gcg gcg gag 15524 
Gly Asp Gly Thr Gly Leu Ser Val Thr Val Glu Tyr Ala Ala Glu 
4755 4760 4765 

ctg ttc tca gaa gcg acc att cgc cgc ctg atc cac cat cac caa 15569 
Leu Phe Ser Glu Ala Thr Ile Arg Arg Leu Ile His His His Gln 
4770 4775 4780 

ctc gtc ctg gag cag atg ttg gcg gcc cat gaa agc gcc acg tgc 15614 
Leu Val Leu Glu Gln Met Leu Ala Ala His Glu Ser Ala Thr Cys 
4785 4790 4795 

ccc ttg gat gtt gcc gac tagcaaaagc cggccgccgt cacccgttca 15662 
Pro Leu Asp Val Ala Asp 
4800 

tcgatagcga gggcaatcat ggattcagcg ttacctacat ctgcatttac cttcgatctc 15722 

ttttacacca cggttaacgc ctactatcgc actgccgcag tcaaggcggc gatcgaactg 15782 

gggctsttcg acgtsgtggg^gcr.^caggsc cgaacteccg cage eatery cgaggcctgc Iz^ls- 

caggcgtcgc cgcgcggcat tcgcatcctt tgctattacc tagtatcgat cggttttcta 15902 

cgccgcaacg gtggcctgtt ctacatagat cgcaacatgg ccatgtacct ggatcgtagt 15962 

tcgcccggct acctgggtgg cagcatcaag ttcctgctct cgccctacat catgagcgcc 16022 

ttcaccgatc tgaccgccgt agtcaggacc ggcaagatca acctggcgca ggacggcgtg 16082 

gtggcaccgg atcacccgca gtgggtggaa tttgcacgcg cgatggcacc gatgatggcg 16142 

ctgccctcgg cgttgatcgc caatatggtg tcgttgcccg ctgatcggcc gattcgtgtg 16202 

ctggacgtgg cagccggcca cggcctgttc ggcatcgcct tcgcgcagcg cttccgccag 16262 

gctgaagtga gcttcctgga ctgggacaac gtgctagacg tagcacgcga aaacgcccag 16322 

gcggccaaag tggccgagcg agcgcgtttc ctgcccggca acgcattcga cctcgatta^ 16382 

ggcagcggct acgacgtgat cttgttgacc aacttcctgc accatttcga tgaggtcgat 16442 

ggcgagcgca tcttggctaa gacgcgcgat gcgctgaacg acgacggcat ggtgatcact 16502 

ttcgaattc 16511 



<210> 2 
<211> 4801 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 2 

Leu Pro Asn Ala Leu Met Gin lie Thr Leu Val Ala Val Gin Phe Ala 
1 5 10 15 



Gly Val Leu Leu Gly Val Thr Ala Arg Ala Ala lie Pro Asn Lys Ala 
20 25 30 



Gly Met Arg Arg Ala Trp Pro Pro Phe Pro Gin Ala Cys Cys Arg Ser 
35 40 45 



He Ala Tyr Leu Met Gin Arg Ser Pro Met Ser Pro Leu Gin Gin Thr 
50 55 60 

Leu Leu Thr Arg Leu Ala Ser Ala Ala Ala Ser Arg Thr Met He Glu 
65 70 75 80 



Phe Pro Arg Pro Glu His Ala Ser Pro Gin Cys Cys Asp Asp Ala Glu 
85 90 95 



Leu Ala Arg Leu He Val Gin Leu Ser Ala Gly Leu Gin Pro Leu Ala 
100 105 110 



Met Pro Gly Thr Tyr Val Ile Ile Ala Ala Pro His Gly Gly Leu Phe 
115 120 125 



Ala Ala Ala Leu Leu Ala Cys Leu His Ala Asn Leu Val Ala Val Pro 
130 135 140 



Phe Pro Leu Asp Val Ala Gin Pro Asn Glu Arg Glu Gin Ala Arg Leu 
145 150 155 160 



Glu Thr He His Ala Gin Leu Met Glu His Gly Asn Val Ala Val Leu 
165 170 175 



Leu Asp Asp Val Ala Asp Arg Ser Ala Phe Ala Arg Met Ala His Ala 
180 185 190 



Ala Gly Thr Phe Leu Ala Thr Phe Ala Asp Leu Lys Arg Glu Ser Thr 
195 200 205 



Ser Ala Ser Leu Cys Pro Ala Ser Pro Ser Asp Ala Ala Leu Leu Leu 
210 215 220 



Phe Thr Ser Gly Ser Ser Gly Glu Ser Lys Gly He Leu Leu Ser His 
225 230 235 240 
Arg Asn Leu His His Gln Ile Gln Ala Gly Ile Arg Gln Trp Ser Leu 
245 250 255 



Asp Glu His Ser His Val Val Thr Trp Leu Ser Pro Ala His Asn Phe 
260 265 270 



Gly Leu His Phe Gly Leu Leu Ala Pro Trp Phe Ser Gly Ala Thr Val 
275 280 285 



Ser Phe He His Pro His Ser Tyr Met Lys Arg Pro Gly Phe Trp Leu 
290 295 300 



Glu Thr Val Ala Ala Arg Asp Ala Thr His Met Ala Ala Pro Asn Phe 
305 310 315 320 



Ala Phe Asp Tyr Cys Cys Asp Trp Val Met Val Glu Gin Leu Pro Pro 
325 330 335 



Ser Ala Leu Ser Thr Leu Thr His He Val Cys Gly Gly Glu Pro Val 
340 345 350 



Arg Ala Ser Thr Met Gin Arg Phe Phe Glu Lys Phe Ala Gly Leu Gly 
355 360 365 



Ala Arg Thr Gln Thr Phe Met Pro His Phe Gly Leu Ser Glu Thr Gly 
370 375 380 



Ala Leu Ser Thr Leu Asp Glu Ala Pro Gin Gin Arg Val Leu Glu Leu 
385 390 395 400 



Asp Ala Asp Ala Leu Asn Lys Arg Lys Arg Val Ala Ala Gly Ala Ser 
405 410 415 



Gin Ala Arg Val Thr Val Leu Asn Cys Gly Ala Val Asp Gin Asp Val 
420 425 430 



Glu Leu Arg lie Val Cys Pro Glu Gly Glu Thr Leu Cys Arg Pro Asp 
435 440 445 



Glu He Gly Glu He Trp Val Lys Ser Pro Ala He Ala Arg Gly Tyr 
450 455 460 



Leu Phe Ala Lys Pro Ala Asp Gin Arg Gin Phe Asn Cys Ser He Arg 
465 470 475 480 



His Thr Asp Asp Ser Gly Tyr Phe Arg Thr Gly Asp Leu Gly Phe Ile 
485 490 495 



Ala Asp Gly Cys Leu Tyr Val Thr Gly Arg Val Lys Glu Val Leu lie 
500 505 510 



lie Arg Gly Lys Asn His Tyr Pro Ala His He Glu Ala Ser lie Ala 
515 520 525 



Ala Thr Ala Ser Pro Gly Ala Leu Met Pro Val Val Phe Ser He Glu 
530 535 540 



Arg Gin Asp Glu Glu Arg Val Ala Ala Val He Ala Val Asn His Pro 
545 550 555 560 

Trp Thr Pro Ala Ala Cys Ala Ala Gin Ala His Lys He Arg Gin Gin 
565 570 575 



Val Ala Asp Gin His Gly Val Ala Leu Ala Glu Leu Ala Phe Ala Glu 
580 585 590 



His Arg His Val Phe Gly Thr Tyr Pro Gly Lys Leu Lys Arg Arg Leu 
595 600 605 



Val Lys Glu Ala Tyr Val Asn Gly Gin Leu Pro Leu Leu Trp His Glu 
610 615 620 



Gly Lys Asn Arg Asp Val Pro Ala Ala Ala Ala Asp Asp Arg Gln Ala 
625 630 635 640 



Gin His Val Ala Asp Leu Cys Arg Lys Val Phe Leu Pro Val Leu Gly 
645 650 655 



Val Ala Pro Pro His Ala Gin Trp Pro Leu Cys Glu Leu Ala Leu Asp 
660 665 670 



Ser Leu Gin Cys Val Arg Leu Ala Gly Ala He Glu Glu Cys Tyr Gly 
675 680 685 



Val Pro Phe Glu Pro Thr Leu Leu Phe Lys Leu Glu Thr Val Gly Ala 
690 695 700 



He Ala Glu Tyr Val Leu Ala His Gly Arg Gin Ala Pro Thr Pro Thr 
705 710 715 720 



Arg Ala Pro Val Ala Ser Thr Thr Cys Ser Glu Glu Pro He Ala He 
725 730 735 



Val Ala Met His Cys Glu Val Pro Gly Ala Gly Glu Asn Thr Glu Ala 
740 745 750 



Leu Trp Ser Phe Leu Arg Ser Asp Val Asn Ala He Arg Pro He Glu 
755 760 765 



Ser Thr Arg Pro Asp Leu Trp Ala Ala Met Arg Ala Tyr Pro Gly Leu 
770 775 780 



Ala Gly Glu Gin Leu Pro Arg Tyr Ala Gly Phe Leu Asp Asp Val Asp 
785 790 795 800 



Ala Phe Asp Ala Ala Phe Phe Gly He Ser Arg Arg Glu Ala Glu Cys 
805 810 815 



Met Asp Pro Gin Gin Arg Lys Val Leu Glu Met Val Trp Lys Leu He 
820 825 830 



Glu Gin Ala Gly His Asp Pro Leu Ser Trp Gly Gly Gin Pro Val Gly 
835 840 845 



Leu Phe Val Gly Ala His Thr Ser Asp Tyr Gly Glu Leu Leu Ala Ser 
850 855 860 



Gin Pro Gin Leu Met Ala Gin Cys Gly Ala Tyr He Asp Ser Gly Ser 
865 870 875 880 



His Leu Thr Met Ile Pro Asn Arg Ala Ser Arg Trp Phe Asn Phe Thr 
885 890 895 



Gly Pro Ser Glu Val He Asn Ser Ala Cys Ser Ser Ser Leu Val Ala 
900 905 910 



Leu His Arg Ala Val Gin Ser Leu Arg Gin Gly Glu Ser Ser Val Ala 
915 920 925 



Leu Val Leu Gly Val Asn Leu He Leu Ala Pro Lys Val Leu Leu Ala 
930 935 940 



Ser Ala Ser Ala Gly Met Leu Ser Pro Asp Gly Arg Cys Lys Thr Leu 
945 950 955 960 



Asp Ala Ala Ala Asp Gly Phe Val Arg Ser Glu Gly He Ala Gly Val 
965 970 975 



He Leu Lys Pro Leu Ala Gin Ala Leu Ala Asp Gly Asp Arg Val Tyr 
980 985 990 
Gly Leu Val Arg Gly Val Ala Val Asn His Gly Gly Arg Ser Asn Ser 
995 1000 1005 



Leu Arg Ala Pro Asn Val Asn Ala Gin Arg Gin Leu Leu lie Arg 
1010 1015 1020 



Thr Tyr Gin Glu Ala Gly Val Glu Pro Ala Ser Val Gly Tyr Val 
1025 1030 1035 



Glu Leu His Gly Thr Gly Thr Ser Leu Gly Asp Pro lie Glu lie 
1040 1045 1050 



Gin Ala Leu Lys Glu Ala Phe lie Ala Leu Gly Ala Gin Ala Ala 
1055 1060 1065 



Pro Ser Asn Cys Gly lie Gly Ser Val Lys Ser Ala Leu Gly His 
1070 1075 1080 



Leu Glu Ala Ala Ala Gly Leu Thr Gly Leu lie Lys Val Leu Leu 
1085 1090 1095 



Met Leu Lys His Gly Glu Gin Ala Gly Thr Arg His Phe Ser Thr 
1100 1105 1110 



Leu Asn Pro Leu He Asp Leu Arg Gly Thr Ser Phe Glu Val Val 
1115 1120 1125 



Ala Gin His Arg Ala Trp Pro Ser Gin Val Gly He His Gly Thr 
1130 1135 1140 



Leu Leu Pro Arg Arg Ala Gly He Ser Ser Phe Gly Phe Gly Gly 
1145 1150 1155 



Ala Asn Ala His Ala He Val Glu Glu His Val He Ala Thr Pro 
1160 1165 1170 



Pro Ser Thr Ser Ser Ala Gly Gly Pro Val Gly He Val Leu Ser 
1175 1180 1185 



Ala Gly Ser Glu Ala Val Leu Arg Gin Gin Val Leu Ala Leu Ser 
1190 1195 1200 



Ala Trp Leu Arg Gin Gin Ser Pro Thr Pro Ala Gin Met He Asp 
1205 1210 1215 



Val Ala Tyr Thr Leu Gin Val Gly Arg Ala Ala Leu Ser His Arg 
1220 1225 1230 
Leu Ala Phe Ser Ala Thr Asp Ala Glu Gln Ala Leu Ala Arg Leu 
1235 1240 1245 



Glu Gly Arg Leu Ala Gly Val Met Asp Ala Glu Val His His Gly 
1250 1255 1260 



Val Val Asp Ala Ala Ala Thr Ala Pro Glu His Gly Arg Gin Thr 
1265 1270 1275 

Arg Glu Gly Leu Ala Gly Leu Leu Arg Ala Trp Thr Gin Gly Val 
1280 1285 1290 



Arg Val Asp Trp Ser Ala Leu Tyr Gly lie Gin Arg Pro Gin Arg 
1295 1300 1305 



Val Ser Leu Pro Val Tyr Pro Phe Ala Arg Glu Arg Tyr Trp Leu 
1310 1315 1320 



Pro Gly Gin Ala Met His Ala Ala Ala Asp Ala His Pro Met Leu 
1325 1330 1335 



Gin Leu Leu His Ala Asn Ala Lys Leu His Arg Tyr Ala Leu Arg 
1340 1345 1350 



Arg Ser Gly Cys Ala Ser Phe Leu Val Asp His Cys Val Asp Gly 
1355 1360 1365 



Arg Gin Val Leu Pro Ala Ala Val Gin Leu Glu Leu Val Arg Ala 
1370 1375 1380 



Val Ala Gin Arg Val Met Ala Gin Asp Glu Gly Cys lie Glu Leu 
1385 1390 1395 



Ala Gin Val Ala Phe Leu His Pro Leu Met Met Glu Glu Thr Glu 
1400 1405 1410 



Leu Glu Val Glu He Glu Leu Ser Lys Ser Asp Gin Asp Glu Phe 
1415 1420 1425 



Asp Phe Gin Leu His Asp Ala His Arg Gin Gin Val Phe Ser Gin 
1430 1435 1440 



Gly His Val Arg Arg Arg Val Tyr Thr Ala Thr Pro Arg Leu Asp 
1445 1450 1455 



Leu Ala Gin Leu Gin Lys Leu Cys Ala Glu Arg Val Leu Ser Gly 
1460 1465 1470 
Glu Asp Cys Tyr Ala His Phe Thr Ala Cys Gly Leu Gln Leu Gly 
1475 1480 1485 



Asp Arg Leu Lys Ser Val Gin Ser He Gly Cys Gly Arg Asn Gly 
1490 1495 1500 



Glu Gly Glu Pro He Ala Leu Gly Val Leu Arg Leu Pro Pro Ser 
1505 1510 1515 



Ser Val Glu Asp Ser His Val Leu Pro Pro Ser Leu Leu Asp Gly 
1520 1525 1530 



Ala Leu Gin Cys Ser Leu Gly Leu Gin Arg Asp Val Glu His He 
1535 1540 1545 



Ala Met Pro Tyr Thr Leu Glu Arg Met Thr Val His Ala Pro He 
1550 1555 1560 



Pro Pro Glu Ala Trp Val Leu Leu Arg His Gly His Ala Ala Arg 
1565 1570 1575 



Gin Ser Leu Asp He Asp Leu Leu Asp Ser Glu Gly Arg Val Cys 
1580 1585 1590 



Val Ser Leu Gly Asn Tyr Thr Gly Arg Ala Pro Lys Ala Val Ser 
1595 1600 1605 



Ala Val Arg Ala Leu Val Leu Ala Pro Val Trp Gin Ala Leu Thr 
1610 1615 1620 



Glu Thr Ala Pro Ala Trp Pro Asp Pro Ala Glu Arg He Val Thr 
1625 1630 1635 



Val Gly Asp Asp Ala Trp Arg Ser His Phe Gly Phe Asp Glu Pro 
1640 1645 1650 



Ala Leu Ser Leu Glu Asp Ser Val Glu Val He Ala Thr Arg Leu 
1655 1660 1665 



Gly Gin Ser Gly Lys Phe Asp His Leu Val Trp He Val Pro He 
1670 1675 1680 



Ala Glu Ser Glu Thr Asp He Ala Ala Gin Gly Ser Ala Ala He 
1685 1690 1695 



Ala Gly Phe Arg Leu Val Lys Ala Leu Leu Ala Leu Gly Tyr Ala 
1700 1705 1710 
His Arg Pro Leu Gly Leu Thr Val Leu Thr Arg Gln Ala Leu Thr 
1715 1720 1725 



Arg Gin Pro Ser His Ala Ala Val His Gly Leu lie Gly Thr Leu 
1730 1735 1740 



Ala Lys Glu Tyr Cys Asn Trp Lys He Arg Leu Leu Asp Leu Pro 
1745 1750 1755 



Ser Val Lys Ser Trp Pro Gin Trp Glu Gin Leu Arg Ser Leu Pro 
1760 1765 1770 



Trp His Ala Gin Gly Glu Ala Leu He Gly Arg Gly Thr Cys Trp 
1775 1780 1785 



Tyr Arg Arg Gin Leu Cys Glu Val Leu Pro Leu Pro Ser Leu Glu 
1790 1795 1800 



Pro Pro Pro Tyr Arg Val Gly Gly Val Tyr Val Val He Gly Gly 
1805 1810 1815 



Ala Gly Gly Leu Gly Glu Val Leu Ser Glu His Leu He Arg Thr 
1820 1825 1830 



Tyr Asp Ala Gln Leu Ile Trp Ile Gly Arg Arg Val Leu Asp Glu 
1835 1840 1845 



Gly He Ala Arg Lys Gin Thr Arg Leu Ala Ser Leu Gly Arg Ala 
1850 1855 1860 



Pro His Tyr He Ser Ala Asp Ala Ser Asp Pro Ala Ala Leu Gin 
1865 1870 1875 



Ala Ala His Asn Glu He Val Ala Leu His Gly Gin Pro His Gly 
1880 1885 1890 



Leu He Leu Ser Asn He Val Leu Lys Asp Ala Ser Leu Ala Arg 
1895 1900 1905 



Met Glu Glu Ala Asp Phe Arg Asp Val Leu Ala Ala Lys Leu Asp 
1910 1915 1920 



Val Ser Val Cys Ala Ala Gin Val Phe Gly Thr Ala Pro Leu Asp 
1925 1930 1935 



Phe Val Leu Phe Phe Ser Ser He Gin Ser Thr Thr Lys Ala Ala 

1940 1945 1950 



Gly Gin Gly Asn Tyr Ala Ala Gly Cys Cys Tyr Val Asp Ala Phe 
1955 1960 1965 

Gly Glu Leu Trp Ala Arg Arg Gly Leu Arg Val Lys Thr Ile Asn 
1970 1975 1980 



Trp Gly Tyr Trp Gly Ser Val Gly Val Val Ala Gly Glu Asp Tyr 
1985 1990 1995 



Arg Arg Arg Met Ala Gin Lys His Met Ala Ser lie Glu Gly Ala 
2000 2005 2010 



Glu Ala Met Gin Val Leu Ser Gin Leu Leu Cys Ala Pro Leu Gin 
2015 2020 2025 



Arg Leu Ala Tyr Val Lys lie Asp Asp Ala Asn Ala Met Arg Ala 
2030 2035 2040 



Leu Gly Val Val Glu Asp Glu Ser Val Gin He Pro Val His Ala 
2045 2050 2055 



Pro Ala Glu Pro Pro Arg Gly Gin Pro Gly Pro Val Val Glu Leu 
2060 2065 2070 



Ser Val Asn Leu Asp Ala Arg Arg Glu Arg Glu Thr Leu Leu Ala 
2075 2080 2085 



Ala Trp Leu Leu Glu Leu He Glu Gin Leu Gly Gly Phe Pro Pro 
2090 2095 2100 



Ala Ser Phe Asp lie Ala Thr Leu Ala Gin Arg Leu His He Val 
2105 2110 2115 



Pro Ala Tyr Arg Ser Trp Leu Glu His Ser Val Arg Met Leu Gly 
2120 2125 2130 



Val Tyr Gly Tyr Leu Arg Ala Thr Gly Glu Ser Arg Phe Glu Leu 
2135 2140 2145 



Ala Asp Lys Pro Pro Asp Asp Ala Arg Gly Ala Trp Asn Ala His 
2150 2155 2160 



Val His Glu Ala Ser Val Glu Ala Gly Glu Glu Ala Gin Arg Arg 
2165 2170 2175 



Leu Leu Asp Arg Cys Met Arg Ala Leu Pro Ala Val Leu Arg Gly 
2180 2185 2190 



Glu Arg Lys Ala Thr Glu Leu Leu Phe Pro Glu Gly Ser Met Ala 
2195 2200 2205 



Trp Val Glu Gly lie Tyr Gin Asn Asn Pro Leu Ala Asp Tyr Phe 
2210 2215 2220 



Asn Ala Gin Leu Val Thr Arg Leu He Ala Tyr Leu Arg Arg Arg 
2225 2230 2235 



Leu Glu Ser Thr Pro Thr Ala Arg Leu Lys Leu Cys Glu He Gly 
2240 2245 2250 



Ala Gly Ser Gly Gly Thr Thr Ala Ser Val Leu Gin Gin Leu Gin 
2255 2260 2265 



Ala Tyr Gly Glu His He Glu Glu Tyr Leu Tyr Thr Asp Leu Ser 
2270 2275 2280 



Pro Val Phe Leu His His Ala Glu Lys His Tyr Gin Pro Arg Ala 
2285 2290 2295 



Pro Tyr Leu Arg Thr Ala Cys Phe Asp Val Ala Arg Ala Pro Thr 
2300 2305 2310 



Gin- Air. Lei: Glu tsor Gly Gly Tyr Asp Vai V-J He Ala Tilz 
2315 2320 2325 



Asn Val Leu His Ala Thr Arg Asp He Ala Lys Thr Leu Arg Asn 
2330 2335 2340 



Ala Lys Ala Leu Leu Lys Pro Gly Gly Leu Leu Leu Leu Asn Glu 
2345 2350 2355 



Val He Glu Arg Ser Leu Val Leu His Leu Thr Phe Gly Leu Leu 
2360 2365 2370 



Glu Ser Trp Trp Leu Pro Gin Asp Lys He Leu Arg Leu Ala Gly 
2375 2380 2385 



Ser Pro Leu Leu Ala Cys Ala Thr Trp Arg Ser Leu Leu Glu Ala 
2390 2395 2400 



Glu Gly Phe Ala Gly Leu Ser Val His Arg Ala Gin Pro Asp Ala 
2405 2410 2415 
Gly Gln Ala Ile Ile Cys Ala Tyr Ser Asp Gly Ile Val Arg Gln 
2420 2425 2430 



Ala Ser Thr lie Glu Val Ala Arg Asn Glu Lys Val Thr Val Pro 
2435 2440 2445 



Ser Gin Pro Ala Glu Ala Gly Glu Ser Pro Leu Asp Leu Val Lys 
2450 2455 2460 



Lys Leu Leu Gly Arg lie Leu Lys Met Asp Pro Ala Thr Leu Asp 
2465 2470 2475 



Thr Ser His Pro Leu Glu Tyr Tyr Gly Val Asp Ser lie Val Ala 
2480 2485 2490 



lie Glu Leu Ala Met Ala Leu Arg Glu Thr Phe Pro Gly Phe Glu 
2495 2500 2505 



Val Ser Glu Leu Phe Glu Thr Gin Ser lie Asp Thr Leu Leu Gly 
2510 2515 2520 



Ser Leu Glu Gin Ala Pro Leu Leu Ala Thr Leu Thr Ala Pro Pro 
2525 2530 2535 



Gin Gin Asp Met Leu Gin Gin Leu Lys Gin Leu Leu Ala Arg Thr 
2540 2545 2550 



Leu Lys Leu Asp lie Thr Gin lie Asp Thr Ser Lys Thr Leu Glu 
2555 2560 2565 



Ser Tyr Gly Val Asp Ser He Val He lie Glu Leu Ala Asn Ala 
2570 2575 2580 



Leu Arg Glu Arg Tyr Pro Ser Leu Asp Ala Ser Gin Leu Met Glu 
2585 2590 2595 



Thr Leu Ser He Asp Arg Leu Val Ala Gin Trp Gin Ala Thr Glu 
2600 2605 2610 



Pro Ala Val Pro Ala Glu Pro Thr Ala Glu Pro Pro Val Ala Asp 
2615 2620 2625 



Glu Asp Ala Ala Ala He He Gly Leu Ala Gly Arg Phe Pro Gly 
2630 2635 2640 



Ala Asp Thr Leu Glu Glu Phe Trp Asn Asn Leu Arg Asn Gly Gin 
2645 2650 2655 
Ser Ser Met Gly Glu Val Pro Gly Glu Arg Trp Asp His Gln His 
2660 2665 2670 



Tyr Phe Asp Ser Glu Arg Gin Ala Pro Gly Lys Thr Tyr Ser Arg 
2675 2680 2685 



Trp Gly Ala Phe Leu Arg Asp lie Asp Gly Phe Asp Ala Ala Phe 
2690 2695 2700 



Phe Glu Trp Pro Asp Ser Val Ala Leu Glu Ser Asp Pro Gin Ala 
2705 2710 2715 



Arg lie Phe Leu Glu Gin Ala Tyr Ala Gly lie Glu Asp Ala Gly 
2720 2725 2730 



Tyr Thr Pro Gly Ser Leu Ser Lys Ser Gin Arg Val Gly Val Phe 
2735 2740 2745 



Val Gly Val Met Asn Gly Tyr Tyr Ser Gly Gly Ala Arg Phe Trp 
2750 2755 2760 



Gin lie Ala Asn Arg Val Ser Tyr Gin Phe Asp Phe Arg Gly Pro 
2765 2770 2775 



Ser Leu Ala Val Asp Thr Ala Cys Ser Ala Ser Leu Thr Ala lie 
2780 2785 2790 



His Leu Ala Leu Glu Ser Leu Arg Ser Gly Ser Cys Glu Val Ala 
2795 2800 2805 



Leu Ala Gly Gly Val Asn Leu Leu Val Asp Pro Gin Gin Tyr Leu 
2810 2815 2820 



Asn Leu Ala Gly Ala Ala Met Leu Ser Ala Gly Ala Ser Cys Arg 
2825 2830 2835 



Pro Phe Gly Glu Ala Ala Asp Gly Phe Val Ala Gly Glu Ala Cys 
2840 2845 2850 



Gly Val Val Leu Leu Lys Pro Leu Lys Gin Ala Arg Ala Asp Gly 
2855 2860 2865 



Asp Val lie His Ala Val lie Arg Gly Ser Met lie Asn Ala Gly 
2870 2875 2880 



Gly His Thr Ser Ala Phe Ser Ser Pro Asn Pro Ala Ala Gin Ala 
2885 2890 2895 
Glu Val Val Arg Gln Ala Leu Gln Arg Ala Gly Val Ala Pro Asp 
2900 2905 2910 



Ser He Ser Tyr He Glu Ala His Gly Thr Gly Thr Val Leu Gly 
2915 2920 2925 



Asp Ala Val Glu Leu Gly Ala Leu Asn Lys Val Phe Asp Lys Arg 
2930 2935 2940 



Ala Ala Pro Cys Pro He Gly Ser Leu Lys Ala Asn He Gly His 
2945 2950 2955 



Ala Glu Ser Ala Ala Gly He Ala Gly Leu Ala Lys Leu Val Leu 
2960 2965 2970 



Gin Phe Arg His Gly Glu Leu Val Pro Ser Leu Asn Ala Phe Pro 
2975 2980 2985 



Leu Asn Pro Tyr He Glu Phe Gly Arg Phe Gin Val Gin Gin Gin 
2990 2995 3000 



Pro Ala Pro Trp Pro Arg Arg Gly Ala Gin Pro Arg Arg Ala Gly 
3005 3010 3015 



Leu Ser Ala Phe Gly Ala Gly Gly Ser Asn Ala His Leu Val Val 
3020 3025 3030 



Glu Glu Ala Pro Ala Met Ala Pro Gly Val Ser He Ser Ala Ser 
3035 3040 3045 



Ser Pro Ala Leu He Val Leu Ser Ala Arg Thr Leu Pro Ala Leu 
3050 3055 3060 



Gin Gin Arg Ala Arg Asp Leu Leu Val Trp Met Gin Ala Arg Gin 
3065 3070 3075 



Val Asp Asp Val Met Leu Ala Asp Val Ala Tyr Thr Leu His Leu 
3080 3085 3090 



Gly Arg Val Ala Met Glu Gin Arg Leu Ala Phe Thr Ala Gly Ser 
3095 3100 3105 



Ala Ala Glu Leu Ser Glu Lys Leu Gin Ala Tyr Leu Gly His Ala 
3110 3115 3120 



He Arg Ala Asp He Tyr Leu Ser Glu Asp Thr Pro Gly Lys Pro 
3125 3130 3135 
Ala Gly Ala Pro Ile Val Ala Glu Glu Asp Leu Leu Thr Leu Met 
3140 3145 3150 



Asp Ala Trp He Glu Lys Gly Gin Tyr Gly Arg Leu Leu Glu Tyr 
3155 3160 3165 



Trp Thr Lys Gly Gin Pro He Asp Trp Asn Lys Leu Tyr Trp Arg 
3170 3175 3180 



Lys Leu Tyr Ala Asp Gly Arg Pro Arg Arg He Ser Leu Pro Thr 
3185 3190 3195 



Tyr Pro Phe Glu His Arg Arg Tyr Trp Gin Thr Pro Val Pro Gly 
3200 3205 3210 



Glu Arg Ser Leu His Ala Thr Ala Pro Ala Thr Arg Glu Thr Val 
3215 3220 3225 



Ala Val Gly Ala Met Pro Asp Pro Ala Gly Ala Thr Val Gin Ala 
3230 3235 3240 



Arg Leu Cys Ala Leu Cys Gin Val Leu Leu Gly Lys Pro Val Thr 
3245 3250 3255 



Ala Gln Met Asp Phe Phe Ala Val Gly Gly His Ser Val Leu Ala 
3260 3265 3270 



He Gin Leu Val Ser Arg He Arg Lys Ser Phe Gly Val Glu Tyr 
3275 3280 3285 



Pro Val Ser Ala Leu Phe Glu Ser Ala Leu Leu Ser Asp Met Ala 
3290 3295 3300 



Arg Gin He Glu Gin Leu Arg Val Asn Gly Val Ala Lys Arg Met 
3305 3310 3315 



Pro Ala Leu Leu Pro Ala Gly Arg Val Gly Ala He Pro Ala Thr 
3320 3325 3330 



Tyr Ala Gin Glu Arg Leu Trp Leu Val His Glu His Met Ser Glu 
3335 3340 3345 

Gin Arg Ser Ser Tyr Asn He Thr Phe Ala Met His Phe Arg Gly 
3350 3355 3360 



Val Asp Phe Arg Ala Glu Ala Met Arg Ala Ala Leu Asn Ala Leu 
3365 3370 3375 
Val Val Arg His Glu Val Leu Arg Thr Arg Phe Leu Ser Glu Asp 
3380 3385 3390 



Gly Gin Leu Gin Gin Val lie Ala Ala Ser Leu Thr Leu Glu Val 
3395 3400 3405 



Pro Val Arg Glu Met Ser Val Glu Glu Val Asp Leu Leu Leu Ala 
3410 3415 3420 



Ala Ser Thr Arg Glu Thr Phe Asp Leu Arg Gin Gly Pro Leu Phe 
3425 3430 3435 



Lys Ala Arg Ile Leu Arg Val Ala Ala Asp His His Val Val Leu 
3440 3445 3450 



Ser Ser He His His He He Ser Asp Gly Trp Ser Leu Gly Val 
3455 3460 3465 



Phe Asn Arg Asp Leu His Gin Leu Tyr Glu Ala Cys Leu Arg Gly 
3470 3475 3480 



Thr Pro Pro Thr Leu Pro Thr Leu Ala Val Gin Tyr Ala Asp Tyr 
3485 3490 3495 



Ala Leu Trp Gln Arg Gln Trp Glu Leu Ala Ala Pro Leu Ser Tyr 
3500 3505 3510 



Trp Thr Arg Ala Leu Glu Gly Tyr Asp Asp Gly Leu Asp Leu Pro 
3515 3520 3525 



Tyr Asp Arg Pro Arg Gly Ala Thr Arg Ala Trp Arg Ala Gly Leu 
3530 3535 3540 



Val Lys His Arg Tyr Pro Pro Gin Leu Ala Gin Gin Leu Ala Ala 
3545 3550 3555 



Tyr Ser Gin Gin Tyr Gin Ala Thr Leu Phe Met Ser Leu Leu Ala 
3560 3565 3570 



Gly Leu Ala Leu Val Leu Gly Arg Tyr Ala Asp Arg Lys Asp Val 
3575 3580 3585 



Cys Ile Gly Ala Thr Val Ser Gly Arg Asp Gln Leu Glu Leu Glu
3590 3595 3600 



Glu Leu Ile Gly Phe Phe Ile Asn Ile Leu Pro Leu Arg Val Asp
3605 3610 3615

Leu Ser Gly Asp Pro Cys Leu Glu Glu Val Leu Leu Arg Thr Arg 
3620 3625 3630 

Gln Val Val Leu Asp Gly Phe Ala His Gln Ser Val Pro Phe Glu
3635 3640 3645 

His Val Leu Gln Ala Leu Arg Arg Gln Arg Asp Ser Ser Gln Ile
3650 3655 3660 

Pro Leu Val Pro Val Met Leu Arg His Gln Asn Phe Pro Thr Gln
3665 3670 3675 

Glu Ile Gly Asp Trp Pro Glu Gly Val Arg Leu Thr Gln Met Glu
3680 3685 3690 

Leu Gly Leu Asp Arg Ser Thr Pro Ser Glu Leu Asp Trp Gln Phe
3695 3700 3705 

Tyr Gly Asp Gly Ser Ser Leu Glu Leu Thr Leu Glu Tyr Ala Gln
3710 3715 3720 

Asp Leu Phe Asp Glu Ala Thr Val Arg Arg Met Ile Ala His His
3725 3730 3735 

Gln Gln Ala Leu Glu Ala Met Val Ser Arg Pro Gln Leu Arg Val
3740 3745 3750 

Gly Lys Trp Asp Met Leu Thr Ala Glu Glu Arg Arg Leu Phe Ala 
3755 3760 3765 

Ala Leu Asn Ala Thr Gly Thr Pro Arg Glu Trp Pro Ser Leu Ala 
3770 3775 3780 

Gln Gln Phe Glu Arg Gln Ala Gln Ala Thr Pro Gln Ala Ile Ala
3785 3790 3795 

Cys Val Ser Asp Gly Gln Ser Trp Ser Tyr Ala Gln Leu Glu Ala
3800 3805 3810 

Arg Ala Asn Gln Leu Ala Gln Ala Leu Arg Gly Gln Gly Ala Gly
3815 3820 3825 

Arg Asp Val Arg Val Ala Val Gln Ser Ala Arg Thr Pro Glu Leu
3830 3835 3840 
Leu Met Ala Leu Leu Ala Ile Phe Lys Ala Gly Ala Cys Tyr Val
3845 3850 3855 



Pro Ile Asp Pro Ala Tyr Pro Ala Ala Tyr Arg Glu Gln Ile Leu
3860 3865 3870 



Ala Glu Val Gln Val Ser Ile Val Leu Glu Gln Asp Glu Leu Ala
3875 3880 3885



Leu Asp Glu Gln Gly Gln Phe His Asn Pro Arg Trp Arg Glu Gln
3890 3895 3900 



Ala Pro Thr Pro Leu Gly Leu Arg Glu His Pro Gly Asp Leu Ala 
3905 3910 3915 



Cys Val Met Val Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val 
3920 3925 3930 



Met Val Pro Tyr Ala Gln Leu His Asn Trp Leu His Ala Gly Trp
3935 3940 3945 



Gln Arg Ser Ala Phe Glu Ala Gly Glu Arg Val Leu Gln Lys Thr
3950 3955 3960 



Ser Ile Ala Phe Ala Val Ser Val Lys Glu Leu Leu Ser Gly Leu
3965 3970 3975 



Leu Ala Gly Val Glu Gln Val Met Leu Pro Asp Glu Gln Val Lys
3980 3985 3990 



Asp Ser Leu Ala Leu Ala Arg Ala Ile Glu Gln Trp Gln Val Thr
3995 4000 4005 



Arg Leu Tyr Leu Val Pro Ser His Leu Gln Ala Leu Leu Asp Ala
4010 4015 4020 



Thr Gln Gly Arg Asp Gly Leu Leu His Ser Leu Arg His Val Val
4025 4030 4035 

Thr Ala Gly Glu Ala Leu Pro Ser Ala Val Arg Glu Thr Val Arg 
4040 4045 4050 



Val Arg Leu Pro Gln Val Gln Leu Trp Asn Asn Tyr Gly Cys Thr
4055 4060 4065 



Glu Leu Asn Asp Ala Thr Tyr His Arg Ser Asp Thr Val Ala Pro 
4070 4075 4080 
Gly Thr Phe Val Pro Ile Gly Ala Pro Ile Ala Asn Thr Glu Val
4085 4090 4095 



Tyr Val Leu Asp Arg Gln Leu Arg Gln Val Pro Ile Gly Val Met
4100 4105 4110 



Gly Glu Leu His Val His Ser Val Gly Met Ala Arg Gly Tyr Trp 
4115 4120 4125 



Asn Arg Pro Gly Leu Thr Ala Ser Arg Phe Ile Ala His Pro Tyr
4130 4135 4140 



Ser Glu Glu Pro Gly Thr Arg Leu Tyr Lys Thr Gly Asp Met Val 
4145 4150 4155 



Arg Arg Leu Ala Asp Gly Thr Leu Glu Tyr Leu Gly Arg Gln Asp
4160 4165 4170 



Phe Glu Val Lys Val Arg Gly His Arg Val Asp Thr Arg Gln Val
4175 4180 4185 



Glu Ala Ala Leu Arg Ala Gln Pro Ala Val Ala Glu Ala Val Val
4190 4195 4200 



Ser Gly His Arg Val Asp Gly Asp Met Gln Leu Val Ala Tyr Val
4205 4210 4215 



Val Ala Arg Glu Gly Gln Ala Pro Ser Ala Gly Glu Leu Lys Gln
4220 4225 4230 



Gln Leu Ser Ala Gln Leu Pro Thr Tyr Met Leu Pro Thr Val Tyr
4235 4240 4245 



Gln Trp Leu Glu Gln Leu Pro Arg Leu Ser Asn Gly Lys Leu Asp
4250 4255 4260 



Arg Leu Ala Leu Pro Ala Pro Gln Val Val His Ala Gln Glu Tyr
4265 4270 4275 



Val Ala Pro Arg Asn Glu Ala Glu Gln Arg Leu Ala Ala Leu Phe
4280 4285 4290 



Ala Glu Val Leu Arg Val Glu Gln Val Gly Ile His Asp Asn Phe
4295 4300 4305 



Phe Ala Leu Gly Gly His Ser Leu Ser Ala Ser Gln Leu Ile Ser
4310 4315 4320 
Arg Ile Arg Gln Ser Phe His Val Asp Leu Pro Leu Ser Arg Ile
4325 4330 4335 



Phe Glu Ala Pro Thr Ile Glu Gly Leu Val Arg Gln Leu Ala Leu
4340 4345 4350 



Pro Ser Glu Gly Gly Val Ala Ser Ile Ala Arg Val Ala Arg Asn
4355 4360 4365 



Arg Thr Ile Pro Leu Ser Leu Phe Gln Glu Arg Leu Trp Phe Val
4370 4375 4380 



His Gln His Met Pro Glu Gln Arg Thr Ser Tyr Asn Gly Thr Leu
4385 4390 4395 



Ala Leu Arg Leu Arg Gly Pro Leu Ser Val Glu Ala Met Arg Ala 
4400 4405 4410 



Ala Leu Arg Ala Leu Val Leu Arg His Glu Ile Leu Arg Thr Arg
4415 4420 4425 



Phe Val Leu Pro Thr Gly Ala Ser Glu Pro Val Gln Val Ile Asp
4430 4435 4440 



Glu His Ser Asp Phe Gln Leu Ser Val Gln Leu Val Glu Asp Thr
4445 4450 4455 



Glu Ile Ala Ser Leu Met Asp Glu Leu Ala Ser His Ile Tyr Asp
4460 4465 4470 



Leu Ala Asn Gly Pro Leu Phe Ile Ala Cys Leu Leu Gln Leu Asp
4475 4480 4485 



Glu Gln Glu His Val Leu Leu Ile Gly Met His His Leu Ile Tyr
4490 4495 4500 



Asp Ala Trp Ser Gln Phe Thr Val Met Asn Arg Asp Leu Arg Val
4505 4510 4515 



Leu Tyr His Arg His Leu Gly Leu Ala Gly Gly Asp Leu Pro Glu 
4520 4525 4530 



Leu Pro Ile Gln Tyr Ala Asp Tyr Ala Ile Trp Gln Arg Ala Gln
4535 4540 4545 



Asn Leu Asp Ala Gln Leu Ala Tyr Trp Gln Ala Met Leu His Asp
4550 4555 4560 
Tyr Asp Asp Gly Leu Glu Leu Pro Tyr Asp Tyr Pro Arg Pro Arg
4565 4570 4575

Asn Arg Thr Trp His Ala Ala Val Tyr Thr His Tta Tyr Pro Ala. 
4580 4585 



Glu Leu Val Gln Arg Phe Ala Gly Phe Val Gln His Gln Ser

4595 4600 

Thr Leu Phe lie Gly Leu Leu Ala Ser Phe Ala V* Val Leu Asn 
4610 4615 " * 

Lys Tyr Thr Gly Arg Asp Asp Leu Cys Ile Gly Thr Thr Thr Ala
4625 4630 

Gly Arg Thr His Leu Glu Leu Glu Asn Leu Ile Gly Phe Phe Ile
4640 4645 4650



Asn Ile Leu Pro
4655 



Leu Arg Leu Arg Leu Asp Gly Asp Pro Asp Val 



4660 



4665 



Ala Glu Ile Met Arg Arg Thr Arg Leu Val Ala Met Ser Ala Phe
4670 4675 4680 

Glu Asn Gln Ala Leu Pro Phe Glu His Leu Leu Asn Ala Leu His
4685 4690 4695

LysGln Arg Asp Thr Ser Arg He Pro Leu Val j» Val Val Met 
4700 4705 



Arg His Gln Asn Phe Pro Asp Thr Ile Gly Asp Ser Asp Gly

4715 4720 

Ile Arg Thr Glu Val Ile Gln Arg Asp Leu Arg Ala Thr Pro Asn
4730 4735 4740



Glu Met Asp Leu Gln Phe Phe Gly Asp Gly Thr Gly Leu Ser Val
4745 4750 4755 

Thr Val Glu Tyr Ala Ala Glu Leu Phe Ser Glu Ala Thr Ile Arg
4760 4765 4770



Arg Leu Ile His His His Gln Leu Val Leu Glu Gln Met Leu Ala
4775 4780 4785

Ala His Glu Ser Ala Thr Cys Pro Leu Asp Val Ala ^ Asp 
4790 4795 
<212> DNA
<220>
<223>



<220> 
<223> Acyl-CoA ligase subdomain I



<220> 
<223> Acyl-CoA ligase subdomain II



<220> 
<222> (897)..(913)
synthase 1 subdomain
<223> Beta-ketoacyl reductase domain



<220> 
<222> (667)..(678)

<223> Acyl carrier protein 1 domain



<220> 
<222> (2484)..(2495)

<223> Acyl carrier protein 2 domain



<220> 

<221> misc_feature 

<222> (2568)..(2579)

<223> Acyl carrier protein 3 domain



<220> 
<222> (3806)..(3811)

<223> Adenylation subdomain I 
<223> Adenylation subdomain
<223> Adenylation subdomain IV
<223> Adenylation subdomain
<223> Adenylation subdomain
<222> (4170)..(4189)

<223> Adenylation subdomain VIII
<223> Adenylation subdomain IX
<223> Peptidyl carrier protein 2 domain
<223> Condensation domain 1 subdomain V
<223> Condensation domain 2 subdomain II

<220> 

<221> misc_feature

<222> (4498)..(4507)

<223> Condensation domain 2 subdomain III
<220> 

<221> misc_feature
<221> misc_feature

<222> (4642)..(4659)

<223> Condensation domain 2 subdomain V
<220>

<221> misc_feature 



<400> 3 . tt pta gcc g tc cag ttt gca 

S £ £ S £ 2 2 £ S S ?.x !u v.x «. - - 

1 5 

^ ant- cac qcq gcg ate ccc aat aag gcg 
ggc gta ttg tta ggc gtc acc get cgc gcg 9 9 ^ ^ ^ ^ 
Gly Val Leu Leu Gly Val Thr Ala Arg 3Q 

20 25 

- * s c s s s s £ s s 9 » s s s s s 

Gly Met Arg Arg Ala Trp Pro pro r 4g 



144 



35 40 
att gct tac ctc atg cag aga tcg cca atg tcg ccg tta cag caa acg
Ile Ala Tyr Leu Met Gln Arg Ser Pro Met Ser Pro Leu Gln Gln Thr
50 55 60



ctg cta acc cgc ctc gcc agt gcg gcc gcc tcc cgg aca atg atc gag
Leu Leu Thr Arg Leu Ala Ser Ala Ala Ala Ser Arg Thr Met Ile Glu
65 70 75 80 

ttt ccg cgt ccg gag cac gca tcg cca caa tgt tgc gac gat gcc gag
Phe Pro Arg Pro Glu His Ala Ser Pro Gln Cys Cys Asp Asp Ala Glu
85 90 95 

ctt gcg cga ctg atc gtg cag ttg tcg gcg gga ctg caa ccg ctg gcg
Leu Ala Arg Leu Ile Val Gln Leu Ser Ala Gly Leu Gln Pro Leu Ala
100 105 110

atg ccg ggt acc tac gtg atc att gcc gcg cca cat ggt ggt ttg ttc
Met Pro Gly Thr Tyr Val Ile Ile Ala Ala Pro His Gly Gly Leu Phe
115 120 125 

gcg gca gcc ctg ctt gcc tgt ttg cat gcc aac ctg gtg gcg gtg ccg
Ala Ala Ala Leu Leu Ala Cys Leu His Ala Asn Leu Val Ala Val Pro 
130 135 140 

ttt cca ctg gat gtt gct cag cca aat gag cgg gaa cag gcc agg ctg
Phe Pro Leu Asp Val Ala Gln Pro Asn Glu Arg Glu Gln Ala Arg Leu
145 150 155 160

gag acg atc cac gca caa ttg atg gag cat ggc aat gta gcg gtt ctg
Glu Thr Ile His Ala Gln Leu Met Glu His Gly Asn Val Ala Val Leu
165 170 175 

ctt gac gat gtc gcc gat cgc agt gcc ttc gcg cgc atg gcg cat gct
Leu Asp Asp Val Ala Asp Arg Ser Ala Phe Ala Arg Met Ala His Ala
180 185 190

gcg ggc acc ttc ctg gcg acc ttc gcc gat cta aag cgc gaa tcg acc
Ala Gly Thr Phe Leu Ala Thr Phe Ala Asp Leu Lys Arg Glu Ser Thr 
195 200 205 

agc gcc tcc ttg tgc ccg gcg tcg cct tcg gac gcc gcc ttg ctg ttg
Ser Ala Ser Leu Cys Pro Ala Ser Pro Ser Asp Ala Ala Leu Leu Leu 
210 215 220 

ttt acc tct ggt tcc tcg ggt gag tcc aag ggc atc ctg ctt agc cac
Phe Thr Ser Gly Ser Ser Gly Glu Ser Lys Gly Ile Leu Leu Ser His
225 230 235 240 

cgc aac ctg cat cat cag atc cag gct ggc atc cgg cag tgg agc ttg
Arg Asn Leu His His Gln Ile Gln Ala Gly Ile Arg Gln Trp Ser Leu
245 250 255 

gac gag cat agc cat gtg gtg acc tgg ctt tct ccc gcg cac aac ttc
Asp Glu His Ser His Val Val Thr Trp Leu Ser Pro Ala His Asn Phe 
260 265 270 

ggc ctg cat ttc ggc ttg ctg gca ccc tgg ttc agt ggc gcg acg gtc 
Gly Leu His Phe Gly Leu Leu Ala Pro Trp Phe Ser Gly Ala Thr Val 
275 280 285 

agt ttc atc cat ccg cac agt tat atg aaa cga ccc ggc ttc tgg ctg
Ser Phe Ile His Pro His Ser Tyr Met Lys Arg Pro Gly Phe Trp Leu
290 295 300



a x ss s K.5 c e s s 2 s s s = £ 



320 

305 



310 315 



s = 5SSS53 a s e s a £ = = 



325 



tct acg ttg tct acg ctt acg cat atc gtg tgt ggc ggc gag ccg gtg
Ser Thr Leu Ser Thr Leu Thr His Ile Val Cys Gly Gly Glu Pro Val
340 345 350

B «-r *i-a caa cac ttc ttc gag aaa ttc gec gga etc ggt 
S S Ser Tar 2 S Arg Phe Phe Glu Lys Phe Ala Gly Leu Gly 
a 360 365 



355 



„- =»r.o «a act ttc atg ccg cac ttc ggc ttg tct gaa acc ggt 
S S tS SS Thr p£ Sec Pro His Phe Gly Leu Ser Glu Thr Gly 
370 375 380 

etc aot acc ttg gac gag gcg ccc caa cag cgc gtc ttg gaa eta 
S 2 S Leu Lp Glu La Pro Gin Gin Arg Val Leu Glu Leu 

385 395 

™^ aac acc ttg aac aaa cgc aag cgc gtg gcg gca ggg gcg age 
Zl E Asp £a Leu Asn Lys Arg Lys Arg val Ala Ala Gly Ala ser 
405 410 

caa aca cat gtg aca gtg etc aat tgc ggc gee gtc gac caa gat gtg 
£n Ala Arg Va! Thr Va? Leu Asn Cys Gly Ala Val Asp Gin Asp Val 

425 



qaq tta cat ate gtc tgt cct gaa ggc gag acg ttg tgc aga cca gat 

IS S Arg So val Cys Pro Glu Gly Glu Thr M Cyc teg Pre fts? 



435 



440 



gag atc ggc gaa ata tgg gta aag tcg cct gcg atc gcc cgt ggc tac
Glu Ile Gly Glu Ile Trp Val Lys Ser Pro Ala Ile Ala Arg Gly Tyr
450 455 460

ctg ttt gcg aag ccc gcc gat cag cga cag ttc aac tgc agc atc cgt
Leu Phe Ala Lys Pro Ala Asp Gln Arg Gln Phe Asn Cys Ser Ile Arg
465 470 475 480

aac aat aac ggt tac ttt cgt acc ggc gac ctg ggt ttc att 
S £ K S S llV Tyr K. Thr Oly «P Leu Gly P£ n. 

485 490 

r,on aat aac tqt ctg tat gtc acc gga agg gta aag gag gtg ctg ate 
Sa Asp Sy Cys Leu Tyr Val Thr Gly Arg Val Lys Glu Val Leu He 



500 



ata cgc ggt aag aat cat tac ccc gca cat atc gaa gcc tcg atc gcc
Ile Arg Gly Lys Asn His Tyr Pro Ala His Ile Glu Ala Ser Ile Ala
515 520 525

^ » w en* hra cct aac qcg ctg atg ccg gtg gtg ttc age ate gag 
get acc gca teg cct ggc gcg y y » Glu 

Ala Thr Ala Ser Pro Gly Ala Leu Met Pro vai vai 

530 535 540 

cgg cag gac gag gag cgc gta gct gcg gtg atc gcc gtc aat cac ccg
Arg Gln Asp Glu Glu Arg Val Ala Ala Val Ile Ala Val Asn His Pro
545 550 555 560 

tgg acg ccg gca gca tgc gcc gcg cag gca cac aag atc cgg caa cag 1728
Trp Thr Pro Ala Ala Cys Ala Ala Gln Ala His Lys Ile Arg Gln Gln
565 570 575 

gta gcc gac cag cat gga gtc gcc ctg gcg gag cta gcc ttt gcc gaa 1776
Val Ala Asp Gln His Gly Val Ala Leu Ala Glu Leu Ala Phe Ala Glu
580 585 590 

cac cgg cac gtg ttc ggc acc tat ccg ggc aaa ctg aag cgg cgc cta 1824
His Arg His Val Phe Gly Thr Tyr Pro Gly Lys Leu Lys Arg Arg Leu 
595 600 605 

gtc aag gaa gcc tat gtc aac ggc cag ctg ccg ttg tta tgg cat gag 1872 
Val Lys Glu Ala Tyr Val Asn Gly Gln Leu Pro Leu Leu Trp His Glu
610 615 620 

ggt aag aac cgg gac gta cca gcg gcc gcc gcg gac gat cgg cag gcg 1920
Gly Lys Asn Arg Asp Val Pro Ala Ala Ala Ala Asp Asp Arg Gln Ala
625 630 635 640 

caa cac gtg gcg gac ctg tgt cgg aag gtc ttt ttg ccg gtg ttg ggt 1968
Gln His Val Ala Asp Leu Cys Arg Lys Val Phe Leu Pro Val Leu Gly
645 650 655 

gtc gcg ccg ccg cat gcc caa tgg ccg ctg tgc gaa ctg gcg ctg gat 2016 
Val Ala Pro Pro His Ala Gln Trp Pro Leu Cys Glu Leu Ala Leu Asp

660 665 670 

tcg ctc caa tgc gtg cgt ctt gcc ggt gcc atc gaa gag tgc tac ggc 2064
Ser Leu Gln Cys Val Arg Leu Ala Gly Ala Ile Glu Glu Cys Tyr Gly
675 680 685 

gtg cct ttc gaa ccc acg ttg cta ttc aag ctt gag acg gtc ggg gca 2112
Val Pro Phe Glu Pro Thr Leu Leu Phe Lys Leu Glu Thr Val Gly Ala
690 695 700 

atc gcc gaa tat gtc ctg gcg cac gga cgt cag gcg ccc acg ccg acg 2160
Ile Ala Glu Tyr Val Leu Ala His Gly Arg Gln Ala Pro Thr Pro Thr
705 710 715 720 

cgt gcg ccg gtg gca agc aca aca tgc tca gag gaa ccg atc gcc att 2208
Arg Ala Pro Val Ala Ser Thr Thr Cys Ser Glu Glu Pro Ile Ala Ile
725 730 735 



gtg gcg atg cac tgt gag gtg ccc gga gcg ggc gag aac act gaa gca 2256
Val Ala Met His Cys Glu Val Pro Gly Ala Gly Glu Asn Thr Glu Ala
740 745 750



ttg tgg tcg ttc ctg cgg agc gac gtc aac gcg atc cgg ccg atc gaa 2304
Leu Trp Ser Phe Leu Arg Ser Asp Val Asn Ala Ile Arg Pro Ile Glu
755 760 765 

tca acg cgc ccg gac tta tgg gca gcg atg cgc gcc tat ccc ggc ctc 2352
Ser Thr Arg Pro Asp Leu Trp Ala Ala Met Arg Ala Tyr Pro Gly Leu 
770 775 780 

gcg ggc gaa cag ctg ccg cgc tat gcg ggt ttc ctc gac gac gtt gat 2400
Ala Gly Glu Gln Leu Pro Arg Tyr Ala Gly Phe Leu Asp Asp Val Asp
785 790 795 800 

gct ttc gat gct gcg ttt ttc ggt atc tcg cgt cgc gag gcc gaa tgc 2448
Ala Phe Asp Ala Ala Phe Phe Gly Ile Ser Arg Arg Glu Ala Glu Cys
805 810 815



S 5 = = 2 s s s = = = a s S = 2 

820 825 

S 2 E 2 = 5 S'2 = 3 5 5 = = C 5 

835 840 845 

.ta ttc qtg ggt gcg cat acg tec gac tat ggc gag ctg ctg gcg age 
Leu Phe S3 G?y Ala His Thr Ser Asp Tyr Gly Glu Leu Leu Ala Ser 
850 855 860 

rf« ata ace caa tgt ggc get tac ate gat teg ggt teg 
2 2 2 £ 9 » £ K i. Gly Ala Tyr XI. «P Oly Ser 



865 



2 2 S S £ 2 = SS £ S 2 52 2 £ = 



885 



890 



aac ccc age gaa gta ate aac age get tgc tec age teg ctg gtg gcg 
gj So ser Glu ?al He Asn Ser Ala Cys Ser Ser Ser Leu Val Ala 

900 905 

eta cat egg gcg gtt caa teg ctg cgc caa ggc gaa age agt gtc gee 
2 h!s Arg LI Val Gin Ser Leu Arg Glu Gly Glu Ser Ser Val Ala 
915 9 20 925 

eta ata etc ggc gtg aac ctt ate ctg get ccc aag gtg ctg tta gee 
Leu val Leu SJ Val Asn Leu He Leu Ala Pro Lys Val Leu Leu Ala 



930 



a ^ gc^gge ats ctt teg ecc gat cgc tgc aag acg ctfr 
^ Si S« Ala Sy Met Leu Ser Pro Asp Gly Arg Cys Lys Thr Leu 



945 



950 955 960 



2496 



2544 



2592 



2640 



2688 



2736 



2784 



2832 



2928 



2976 



- s s e 2 ^ s s s 2 2 s £ sr. s s 

S 2 S 2 = S 2 S 2 2 2 2 2 S 2 2 

980 965 

_ „ ta atc cqc ggc gtg gcg gtc aac cat ggc ggc cgt tec aat tec 3024 
Z Leu S Sg G?y va? Ill Si Asn His Gly Gly Arg Ser Asn Ser 
995 1000 1005 

ttg cgt gct ccc aac gtc aac gcg cag cgg caa ctg ctg atc cgg 3069
Leu Arg Ala Pro Asn Val Asn Ala Gln Arg Gln Leu Leu Ile Arg
1010 1015 1020

act tac cag gaa gcc ggt gtc gag ccg gcc agc gtc ggt tat gtt 3114
Thr Tyr Gln Glu Ala Gly Val Glu Pro Ala Ser Val Gly Tyr Val
1025 1030 1035

gaa cta cac ggc act ggt acc agc ctg ggt gat ccg atc gaa atc
Glu Leu His Gly Thr Gly Thr Ser Leu Gly Asp Pro Ile Glu Ile
1040 1045 1050

cag gcg ctg aag gaa gct ttc att gcg ttg ggg gca cag gcc gcc 3204
Gln Ala Leu Lys Glu Ala Phe Ile Ala Leu Gly Ala Gln Ala Ala
1055 1060 1065 

ccg tca aac tgc ggc atc ggt tcg gtg aag tcc gcg ctg ggc cat 3249
Pro Ser Asn Cys Gly Ile Gly Ser Val Lys Ser Ala Leu Gly His
1070 1075 1080 

cta gaa gcc gct gca ggc ctg acc ggc ctg atc aag gtg ctg ctg 3294
Leu Glu Ala Ala Ala Gly Leu Thr Gly Leu Ile Lys Val Leu Leu
1085 1090 1095 

atg ctc aag cac ggc gag cag gcc ggc acg cgc cat ttc agc acg 3339
Met Leu Lys His Gly Glu Gln Ala Gly Thr Arg His Phe Ser Thr
1100 1105 1110

ctc aat ccg ctg atc gat ttg cga ggt acg tca ttc gaa gtg gtg 3384
Leu Asn Pro Leu Ile Asp Leu Arg Gly Thr Ser Phe Glu Val Val
1115 1120 1125

gcg cag cat cgc gca tgg ccg tcg cag gtc ggc att cac ggc aca 3429
Ala Gln His Arg Ala Trp Pro Ser Gln Val Gly Ile His Gly Thr
1130 1135 1140

ctc ttg ccg cgt cgc gcg ggt atc agc tca ttc ggc ttc ggc ggc 3474
Leu Leu Pro Arg Arg Ala Gly Ile Ser Ser Phe Gly Phe Gly Gly
1145 1150 1155

gcc aat gcg cat gcg atc gtg gaa gag cat gtc att gcc acg ccc 3519
Ala Asn Ala His Ala Ile Val Glu Glu His Val Ile Ala Thr Pro
1160 1165 1170

ccc tcg acg agc tcc gct ggc ggc ccg gta ggt atc gtg ttg tca 3564
Pro Ser Thr Ser Ser Ala Gly Gly Pro Val Gly Ile Val Leu Ser
1175 1180 1185

gcc ggt agt gaa gct gtc ttg cgg caa caa gtg ctg gcc ttg tca 3609
Ala Gly Ser Glu Ala Val Leu Arg Gln Gln Val Leu Ala Leu Ser
1190 1195 1200 

gcc tgg cta agg cag caa tcg ccg aca ccc gcg caa atg atc gat 3654
Ala Trp Leu Arg Gln Gln Ser Pro Thr Pro Ala Gln Met Ile Asp
1205 1210 1215 

gtc gcc tac acc tta cag gta gga cgc gca gcc ctg tcg cac agg 3699
Val Ala Tyr Thr Leu Gln Val Gly Arg Ala Ala Leu Ser His Arg
1220 1225 1230 

ttg gct ttt agc gcg acg gac gcc gag cag gca ttg gcg agg ctt 3744
Leu Ala Phe Ser Ala Thr Asp Ala Glu Gln Ala Leu Ala Arg Leu
1235 1240 1245

gag ggt cgt ctg gcg ggc gtg atg gat gcc gag gtc cat cac ggt 3789
Glu Gly Arg Leu Ala Gly Val Met Asp Ala Glu Val His His Gly 
1250 1255 1260 

gtc gtg gat gct gcc gca acg gct ccc gaa cat ggg cgg cag acg 3834
Val Val Asp Ala Ala Ala Thr Ala Pro Glu His Gly Arg Gln Thr
1265 1270 1275 

cgc gaa ggt ctt gcc ggt ttg ctg cga gcc tgg act cag ggc gtg 3879
Arg Glu Gly Leu Ala Gly Leu Leu Arg Ala Trp Thr Gln Gly Val
1280 1285 1290 
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cgc gtc gat tgg teg gcg ctg tac ggc ata cag cga 
Arg Val Asp Trp Ser Ala Leu Tyr Gly lie Gin Arg 
1295 1300 1305 



gtt age ctg cct gtc tac ccc 
Val Ser Leu Pro Val Tyr Pro 
1310 1315 



ttc get agg gaa cgc 
Phe Ala Arg Glu Arg 
1320 



ccc ggc cag get atg cat gee get gcg gac get cat 
Pro Gly Gin Ala Met His Ala Ala Ala Asp Ala His 
1325 1330 1335 



ccg cag cgc 
Pro Gin Arg 



tat tgg ctg 
Tyr Trp Leu 



ccg atg ctg 
Pro Met Leu 



cag ctg ttg cat gec aat gee 
Gin Leu Leu His Ala Asn Ala 
1340 1345 



aaa eta cat cgc tac gee ttg cgt 
Lys Leu His Arg Tyr Ala Leu Arg 
1350 



agg tec ggc tgc gca age ttt ctt gtt gat cat tgc gtg gat ggt 
Arg Ser Gly Cys Ala Ser Phe Leu Val Asp His Cys Val Asp Gly 
1355 1360 1365 



cga cag gta eta ccg gca gee 
Arg Gin Val Leu Pro Ala Ala 

1370 1375 
gtg gcg cag egg gtc atg gcg 
Val Ala Gin Arg Val Met Ala 

1385 1390 

gcg cag gtc gee ttt ttg cat 
Ala Gin Val Ala Phe Leu His 
1400 1405 

ctg gag gtc gaa ate gaa ctg 
Leu Glu Val Glu He Glu Leu 
1415 1420 

gat ttc caa ctt cac gat get 
Asp Phe Gin Leu His Asp Ala 
1430 1435 



gtg caa ctg gaa ttg 
Val Gin Leu Glu Leu 
1380 

cag gat gag ggt tgt 
Gin Asp Glu Gly Cys 
1395 

ccc etc atg atg gag 
Pro Leu Met Met Glu 
1410 

teg aag age gat caa 
Ser Lys Ser Asp Gin 
1425 

cac cgc caa cag gtc 
His Arg Gin Gin Val 
1440 



gtg cgc gee 
Val Arg Ala 

ate gaa ctg 
He Glu Leu 



gag act gag 
Glu Thr Glu 



gat gag ttc 
Asp Glu Phe 



ttt age cag 
Phe Ser Gin 



ggg cac gta cgt cgc egg gtc tat acg gcg aca ccg 
Gly His Val Arg Arg Arg Val Tyr Thr Ala Thr Pro 
1445 1450 1455 



tta gee cag ctg caa aag ctt 
Leu Ala Gin Leu Gin Lys Leu 
1460 1465 



tgt gee gag cgc gtg 
Cys Ala Glu Arg Val 
1470 



gaa gac tgt tat gcg cac ttc ace gee tgc gga ttg 
Glu Asp Cys Tyr Ala His Phe Thr Ala Cys Gly Leu 
1475 1480 1485 



cgc ttg gat 
Arg Leu Asp 



ttg tec ggc 
Leu Ser Gly 



cag etc ggc 
Gin Leu Gly 



gac egg etc aaa tec gtg caa 
Asp Arg Leu Lys Ser Val Gin 
1490 1495 



teg ate ggc tgc gga cgc aat ggc 
Ser He Gly Cys Gly Arg Asn Gly 
1500 



gag ggc gag ccg ate gca ttg ggt gtc ctg cgc ctg 
Glu Gly Glu Pro He Ala Leu Gly Val Leu Arg Leu 
1505 1510 1515 



age gtt gaa gac age cat gtg 
Ser Val Glu Asp Ser His Val 
1520 1525 



ctg cct cct age ctg 
Leu Pro Pro Ser Leu 
1530 



cca cca tea 
Pro Pro Ser 



ctt gat ggt 
Leu Asp Gly 



3924 



3969 



4014 



4059 



4104 



4149 



4194 



4239 



4284 



4329 



4374 



4419 



4464 



4509 



4554 



4599 
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gcc ttg cag tgt age ctt ggc 
Ala Leu Gin Cys Ser Leu Gly 
1535 1540 

gcc atg cca tac acg ctg gag 
Ala Met Pro Tyr Thr Leu Glu 
1550 1555 



ttg cag cgt gat gtc 
Leu Gin Arg Asp Val 
1545 

egg atg acg gtg cat 
Arg Met Thr Val His 
1560 



cct ccc gag gcc tgg gtg ctg ctg cgt cac ggc cat 
Pro Pro Glu Ala Trp Val Leu Leu Arg His Gly His 
1565 1570 1575 



gag cac ate 
Glu His He 



gcg ccg att 
Ala Pro He 



gca gcc aga 
Ala Ala Arg 



cag tec ctg gac ate gat etc 
Gin Ser Leu Asp He Asp Leu 
1580 1585 



ctg gat tec gaa ggt agg gtc tgc 
Leu Asp Ser Glu Gly Arg Val Cys 
1590 



gtc age etc ggc aat tac acc ggc cgt gca ccg aaa 
val Ser Leu Gly Asn Tyr Thr Gly Arg Ala Pro Lys 
159 5 1600 1605 



gcc gtt tec 
Ala Val Ser 



gcc gtc agg gcg ctt gtc ttg 
Ala Val Arg Ala Leu Val Leu 
1610 1615 



gca ccg gtc tgg caa gcg ttg acc 
Ala Pro Val Trp Gin Ala Leu Thr 
1620 



gaa acg gcg ccg gca tgg ccc gat ccg gcc gaa cgc 
Glu Thr Ala Pro Ala Trp Pro Asp Pro Ala Glu Arg 
1625 1630 1635 



ate gtt acg 
He Val Thr 



gta gga gac gat gca tgg cgt 
Val Gly Asp Asp Ala Trp Arg 
1640 1645 



agt cac ttc ggt ttc 
Ser His Phe Gly Phe 
1650 



gac gag ccg 
Asp Glu Pro 



gcc ttg 
Ala Leu 



tec ctg gag gac age gtc gaa gtc ate gcg acg cga ctg 
Ser Leu Glu Asp Ser Val Glu Val He Ala Thr Arg Leu 

1660 1£65 



age cag age ggc aag ttc gat cat eta gtc tgg ate gtg ccg ata 
Gly Gin Ser Gly Lys Phe Asp His Leu Val Trp He Val Pro He 
1670 1675 1680 



gcc gag agt gaa acc gat att gca gcg caa ggt tea 
Ala Glu Ser Glu Thr Asp He Ala Ala Gin Gly Ser 
1685 1690 1695 



gcg gcg ate 
Ala Ala He 



gcc ggt ttc egg ttg gtc aag 
Ala Gly Phe Arg Leu Val Lys 
1700 1705 



gcg ttg ctt gcg ttg ggc tat gcg 
Ala Leu Leu Ala Leu Gly Tyr Ala 
1710 



cat cgc ccg ctg ggt etc acc gtg ctg act cgc caa 
His Arg Pro Leu Gly Leu Thr Val Leu Thr Arg Gin 
1715 1720 1725 



egg cag ccg teg cac gcg gca 
Arg Gin Pro Ser His Ala Ala 
1730 1735 



gtg cac ggg ctg ate 
Val His Gly Leu He 
1740 



gcc ctt acg 
Ala Leu Thr 



ggg acg ctg 
Gly Thr Leu 



gcc aag gaa tac tgc aac tgg aaa ate cgt ctg etc gac ctg ccg 
Ala Lys Glu Tyr Cys Asn Trp Lys He Arg Leu Leu Asp Leu Pro 
1745 1750 1755 



age gta aaa tct tgg ccg caa 
Ser Val Lys Ser Trp Pro Gin 



tgg gag caa ttg egg 
Trp Glu Gin Leu Arg 
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teg ttg cct 
Ser Leu Pro 
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1760 1765 1770 

tgg cat gcg cag ggc gaa gcc ctg atc ggc cgt ggg act tgt tgg 5364
Trp His Ala Gln Gly Glu Ala Leu Ile Gly Arg Gly Thr Cys Trp
1775 1780 1785 

tat cgg cgg cag ttg tgt gaa gtg ctg ccg ctg ccg tcg ttg gaa 5409
Tyr Arg Arg Gln Leu Cys Glu Val Leu Pro Leu Pro Ser Leu Glu
1790 1795 1800 

ccg ccg ccg tac cgc gta ggc ggt gtc tac gtc gtg atc ggc ggc 5454
Pro Pro Pro Tyr Arg Val Gly Gly Val Tyr Val Val Ile Gly Gly
1805 1810 1815 

gct ggc ggc ttg ggt gaa gta ttg agc gaa cac ttg atc cgc acg 5499
Ala Gly Gly Leu Gly Glu Val Leu Ser Glu His Leu Ile Arg Thr
1820 1825 1830 

tac gac gcg cag ctg atc tgg atc ggg cgg cgc gtg ctg gac gaa 5544
Tyr Asp Ala Gln Leu Ile Trp Ile Gly Arg Arg Val Leu Asp Glu
1835 1840 1845 

ggc att gcg cgc aag cag acc cgg ctt gcg tcg ctg ggc cgc gca 5589
Gly Ile Ala Arg Lys Gln Thr Arg Leu Ala Ser Leu Gly Arg Ala
1850 1855 1860

ccg cat tac atc tcc gcg gac gcg agt gac ccg gct gcc ctg cag 5634
Pro His Tyr Ile Ser Ala Asp Ala Ser Asp Pro Ala Ala Leu Gln
1865 1870 1875 

gcg gca cat aat gag atc gtt gcg ctg cat ggc cag ccc cat ggg 5679
Ala Ala His Asn Glu Ile Val Ala Leu His Gly Gln Pro His Gly
1880 1885 1890 

ctc atc cta agc aac atc gtg ctg aag gat gcc agt ctg gct cgt 5724
Leu Ile Leu Ser Asn Ile Val Leu Lys Asp Ala Ser Leu Ala Arg
1895 1900 1905 

atg gag gaa gcc gat ttc cgt gac gtg ctg gcc gcg aaa ctc gac 5769
Met Glu Glu Ala Asp Phe Arg Asp Val Leu Ala Ala Lys Leu Asp 
1910 1915 1920 

gtc agc gtg tgt gcg gca cag gtg ttc ggc acg gcc ccc ctt gat 5814
Val Ser Val Cys Ala Ala Gln Val Phe Gly Thr Ala Pro Leu Asp
1925 1930 1935 

ttc gtg ctg ttt ttt tct tcc atc cag agc act acc aag gcg gcc 5859
Phe Val Leu Phe Phe Ser Ser Ile Gln Ser Thr Thr Lys Ala Ala
1940 1945 1950 

ggg caa ggt aac tac gcc gcc ggc tgc tgc tat gtc gac gct ttc 5904
Gly Gln Gly Asn Tyr Ala Ala Gly Cys Cys Tyr Val Asp Ala Phe
1955 1960 1965

ggc gag cta tgg gcg cgc cgg ggt ttg agg gta aag acc atc aac 5949
Gly Glu Leu Trp Ala Arg Arg Gly Leu Arg Val Lys Thr Ile Asn
1970 1975 1980 

tgg ggc tac tgg ggc agc gtg ggc gtc gta gcg ggc gag gac tat 5994
Trp Gly Tyr Trp Gly Ser Val Gly Val Val Ala Gly Glu Asp Tyr 
1985 1990 1995 



cgc cgg cgc atg gcg caa aaa cac atg gct tcg att gag ggt gcc
Arg Arg Arg Met Ala Gln Lys His Met Ala Ser Ile Glu Gly Ala

2000 2005 2010 

gaa gcg atg cag gtg ttg tcg cag ttg ttg tgt gcg ccg ttg caa
Glu Ala Met Gln Val Leu Ser Gln Leu Leu Cys Ala Pro Leu Gln

2015 2020 2025 



egg ctt gec tac gtc aag ate gac gat get aac gca atg cgc get 
Arg Leu Ala Tyr Val Lys He Asp Asp Ala Asn Ala^ Met Arg Ala 
2030 2035 



2040 



ctg ggc gta gta gag gac gag age gtg caa ate cct gtg cac gca 
Leu Gly Val Val Glu Asp Glu Ser Val Gin He Pro Val His Ala 
2045 2050 2055 



ccg gec gag cct ccc aga ggg 
Pro Ala Glu Pro Pro Arg Gly 

2060 2065 
teg gtg aat ctg gat gec egg 
Ser Val Asn Leu Asp Ala Arg 

2075 2080 



cag cct ggt ccc gtg 
Gin Pro Gly Pro Val 
2070 

cgc gaa egg gaa act 
Arg Glu Arg Glu Thr 
2085 



gtc gag ttg 
Val Glu Leu 

ttg ctg gcg 
Leu Leu Ala 



gec tgg ctg ctt gag ttg ate gag caa etc ggt ggt ttt ccg ccg 
Ala Trp Leu Leu Glu Leu He Glu Gin Leu Gly Gly Phe Pro Pro 
2090 2095 2100 



gca agt ttc gac ate get acg ctt gcg caa cgc ctg 
Ala Ser Phe Asp He Ala Thr Leu Ala Gin Arg Leu 
2105 2110 2115 



cac ate gta 
His He Val 



ccc gec tat cga age tgg ctg gaa cac age gtg egg atg etc ggc 
Pro Ala Tyr Arg Ser Trp Leu Glu His Ser Val Arg Met Leu Gly 
2120 2125 2130 



gta tat gat tac etc aga gcg acg ggg gaa age cga 

Val Tyr Gly Tyr Leu -ftrg Ala Thr Giy, Giu Ser Arg 



ttc qag ctg 

Phe Ciiu -Lex** 



2135 



2140 



2145 



gec gac aag ccg ccc gat gat gec agg ggt gec tgg aac gcg cat 
Ala Asp Lys Pro Pro Asp Asp Ala Arg Gly Ala Trp Asn Ala His 
2150 2155 2160 



gtg cac 
Val His 
2165 



gag gec age gtc gaa gec ggt gaa gag gca cag egg cgt 
Glu Ala Ser Val Glu Ala Gly Glu Glu Ala Gin Arg Arg 
2170 2175 



ctg ctc gat cgc tgc atg cgg gcg ttg ccg gcg gtc ctt cga ggc
Leu Leu Asp Arg Cys Met Arg Ala Leu Pro Ala Val Leu Arg Gly 
2180 2185 2190 

gaa cgc aag gcc acc gaa ttg ctg ttt ccg gaa ggt tcg atg gcg
Glu Arg Lys Ala Thr Glu Leu Leu Phe Pro Glu Gly Ser Met Ala 
2195 2200 2205 

tgg gtc gag ggt atc tac cag aac aac ccg ctt gcc gat tac ttc
Trp Val Glu Gly Ile Tyr Gln Asn Asn Pro Leu Ala Asp Tyr Phe
2210 2215 2220 

aac gca caa cta gtc acg cga ctg att gcc tac ttg aga cga cga
Asn Ala Gln Leu Val Thr Arg Leu Ile Ala Tyr Leu Arg Arg Arg
2225 2230 2235 

cta gag tcg acg cct acg gcg cgc ctg aag ctg tgc gag atc ggc
Leu Glu Ser Thr Pro Thr Ala Arg Leu Lys Leu Cys Glu Ile Gly
2240 2245 2250 



cag ttg cag 
Gin Leu Gin 



gcc ggc age ggt ggt act act gca age gtg eta caa 
Ala Gly Ser Gly Gly Thr Thr Ala Ser Val Leu Gin 
2255 2260 2265 

aca tat ggt gag cat att gag gaa tat etc tat ace gac ctg teg 
Sa Sr G?y 111 His He Glu Glu Tyr Leu Tyr Thr Asp Leu Ser 
2270 2275 2280 



cca cga gcg 
Pro Arg Ala 



cct gtc ttc ctg cat cat gcg gaa aaa cac tat cag 

Pro Val Phe Leu His His Ala Glu Lys His Tyr Gin 
2285 2290 2295 

cct tat ttg agg acc gcc tgt ttc gac gta gcg cgc 

Pro Tyr Leu Arg Thr Ala Cys Phe Asp Val Ala Arg 
2300 2305 2310 

gcg cag gcc ctg gaa tct ggc ggc tac gac gtg gtg att gcc gcc 

III Gin Ala Leu Glu Ser Gly Gly Tyr Asp Val Val lie Ala Ala 
2315 2320 2325 



gcg ccg acg 
Ala Pro Thr 



aac gta ctg cat get acg cgc 
Asn Val Leu His Ala Thr Arg 
2330 2335 



gat ate gcc aag acc ttg cgc aat 
Asp He Ala Lys Thr Leu Arg Asn 
2340 



etc aac gaa 
Leu Asn Glu 



gcg aag gca etc etc aaa cct ggc ggt ctg etc ttg 
Ala Lys Ala Leu Leu Lys Pro Gly Gly Leu Leu Leu 
2345 2350 2355 

gtg ate gag cgc age etc gtc ttg cac ctg act ttc ggt ctg ctg 
va? lie Glu Arg Ser Leu Val Leu His Leu Thr Phe Gly Leu Leu 
2360 2365 2370 

gag age tgg t^g ttg ccc cc^ gac 2=3 ate tt S cgc etc gcc 
III sir Trp t£ Leu Pro Gin Asp Lys lie Leu Arg Leu Ala Gly 
2375 2380 2385 



teg ccg ttg ctg get tgc gcc acc tgg cgc age ctg 
Ser Pro Leu Leu Ala Cys Ala Thr Trp Arg Ser Leu 
2390 2395 2400 



ctg gag get 
Leu Glu Ala 



gag ggt ttt gcg ggg ctg age 
Glu Gly Phe Ala Gly Leu Ser 
2405 2410 

ggg cag gcc ate ate tgt gcc 
Gly Gin Ala He He Cys Ala 
2420 2425 



gtg cac agg gcg caa ccc gat gcc 
Val His Arg Ala Gin Pro Asp Ala 
2415 

tac age gat ggg ata gtg egg caa 
Tyr Ser Asp Gly He Val Arg Gin 
2430 



gcc agt acg ate gag gtt gcg egg aat gaa aaa gta 
Ala Ser Thr He Glu Val Ala Arg Asn Glu Lys Val 
2435 2440 2445 



acc gtt ccg 
Thr Val Pro 



teg cag ccg gcg gaa gcc ggg 
Ser Gin Pro Ala Glu Ala Gly 
2450 2455 



gaa teg ccg ctg gat ctg gtc aaa 
Glu Ser Pro Leu Asp Leu Val Lys 

2460 



aaa ctg ctt gga cgc att ctg aaa atg gat ccg gcc 
Lys Leu Leu Gly Arg Ile Leu Lys Met Asp Pro Ala
2465 2470 2475 



aca ctc gat
Thr Leu Asp 
acc agc cac ccg ctg gag tac
Thr Ser His Pro Leu Glu Tyr 
2480 2485 



tac ggt gtc gat tcg atc gtg gcg
Tyr Gly Val Asp Ser Ile Val Ala
2490 



atc gaa ctg gct atg gca ctg cgc gag aca ttc ccg
Ile Glu Leu Ala Met Ala Leu Arg Glu Thr Phe Pro
2495 2500 2505 



gtc agc gag ctg ttt gaa acg
Val Ser Glu Leu Phe Glu Thr 
2510 2515 



caa tcc atc gat acc
Gln Ser Ile Asp Thr
2520 



tct ctt gag cag gct cct ctc ctt gct acc ctc aca
Ser Leu Glu Gln Ala Pro Leu Leu Ala Thr Leu Thr
2525 2530 2535 



ggt ttt gaa 
Gly Phe Glu 



ttg ttg ggc 
Leu Leu Gly 



gct ccg ccg
Ala Pro Pro 



caa caa gac atg ctg cag cag ctg aaa caa ctg ctg gcg cgt acg 
Gln Gln Asp Met Leu Gln Gln Leu Lys Gln Leu Leu Ala Arg Thr
2540 2545 2550 



ctg aag ctg gac att acg cag ate gac acg age aag 
Leu Lys Leu Asp He Thr Gin He Asp Thr Ser Lys 
2555 2560 2565 



age tat ggt gtc gac tec ate 
Ser Tyr Gly Val Asp Ser He 
2570 2575 



gtc ate ate gaa tta 
Val He He Glu Leu 
2580 



acg ctg gag 
Thr Leu Glu 



gec aac gee 
Ala Asn Ala 



ttg cgt gag cgc tat ccg age ttg gac gcg tea cag 
Leu Arg Glu Arg Tyr Pro Ser Leu Asp Ala Ser Gin 
2585 2590 2595 

acc tta teg ate gac egg ctg gtt gee caa tgg cag 
Thr Leu Ser He Asp Arg Leu Val Ala Gin Trp Gin 
2600 2605 2610 

ccc gee gta ccg gca gag cca aca gcg gaa ccg ccg 
Pro Ala Val Pro Ala Glu Pro Thr Ala Glu Pro Pro 
2615 2620 2625 



gaa gac gec get gee ate ate 
Glu Asp Ala Ala Ala He He 
2630 2635 



gga ctg gee ggc cgc 
Gly Leu Ala Gly Arg 
2640 



ctg atg gaa 
Leu Met Glu 



gca acg gag 
Ala Thr Glu 



gta gee gac 
Val Ala Asp 



ttt cca ggc 
Phe Pro Gly 



gcg gac acg ttg gag gag ttc tgg aac aac ctg cgc 
Ala Asp Thr Leu Glu Glu Phe Trp Asn Asn Leu Arg 
2645 2650 2655 

age agt atg gga gag gtg cca ggc gag cgc tgg gat 
Ser Ser Met Gly Glu Val Pro Gly Glu Arg Trp Asp 
2660 2665 2670 

tac ttc gac agt gaa cgc cag gca ccg ggc aag acg 
Tyr Phe Asp Ser Glu Arg Gin Ala Pro Gly Lys Thr 
2675 2680 2685 

tgg ggt gcg ttt ctg agg gac ata gac ggc ttc gat 
Trp Gly Ala Phe Leu Arg Asp He Asp Gly Phe Asp 
2690 2695 2700 

ttt gaa tgg ccc gac age gtc gcg ctg gaa teg gat 
Phe Glu Trp Pro Asp Ser Val Ala Leu Glu Ser Asp 
2705 2710 2715 



aac ggc caa 
Asn Gly Gin 



cac cag cac 
His Gin His 



tat age cgc 
Tyr Ser Arg 



gca gee 
Ala Ala 



ttc 
Phe 



ccg caa gcg 
Pro Gin Ala 
cgg ata ttt cta gag cag gcc
Arg Ile Phe Leu Glu Gln Ala
2720 2725 



tat gee ggg ate gaa 
Tyr Ala Gly He Glu 
2730 



tac acg cct ggc teg etc age aag age caa cgc gta 
Tyr Thr Pro Gly Ser Leu Ser Lys Ser Gin Arg Val 
2735 2740 2745 



gta ggt gtg atg aat ggt tac 
Val Gly Val Met Asn Gly Tyr 

2750 2755 
caa ate gec aac cgc gtg teg 
Gin He Ala Asn Arg Val Ser 

2765 2770 



tac age ggc gga gcg 
Tyr Ser Gly Gly Ala 
2760 

tac cag ttc gat ttt 
Tyr Gin Phe Asp Phe 
2775 



gat gec ggc 
Asp Ala Gly 



ggt gta ttc 
Gly Val Phe 



cgc ttc tgg 
Arg Phe Trp 

cgc ggg cca 
Arg Gly Pro 



age ctg gcg gtg gat ace gee tgt teg get teg etc acc gcg ate 
Ser Leu Ala Val Asp Thr Ala Cys Ser Ala Ser Leu Thr Ala He 
2780 2785 2790 

cac ctg gcg ctg gaa age ctg cgc age ggc agt tgc gag gtc gca 
His Leu Ala Leu Glu Ser Leu Arg Ser Gly Ser Cys Glu Val Ala 
2795 2800 2805 

ctg gee ggt ggc gtg aat ctg ctg gtc gat ccg cag caa tat ctt 
Leu Ala Gly Gly Val Asn Leu Leu Val Asp Pro Gin Gin Tyr Leu 
2810 2815 2820 

aat ttg get ggc gee gcg atg etc tec gec ggc gee age tgt egg 
Asn Leu Ala Gly Ala Ala Met Leu Ser Ala Gly Ala Ser Cys Arg 
2825 2830 2835 



ccg ttc ggc gag gee gcg gac 
Pro Phe Gly Glu Ala Ala Asp 
2840 2845



ggt ttc gtg gee ggc gaa gee tgc 
Gly Phe Val Ala Gly Glu Ala Cys 
2850 



ggc gtg gtg ctg etc aag ccg etc aag caa gcg agg gec gat ggc 
Gly Val Val Leu Leu Lys Pro Leu Lys Gin Ala Arg Ala Asp Gly 
2855 2860 2865 

gat gtg ate cat gee gta ate agg ggc age atg ate aat gee ggt 
Asp Val He His Ala Val He Arg Gly Ser Met He Asn Ala Gly 
2870 2875 2880 



ggg cac acc age gcg ttc tec teg cct aac cct gee 
Gly His Thr Ser Ala Phe Ser Ser Pro Asn Pro Ala 
2885 2890 2895 



gaa gtc gtg egg cag gec ttg 
Glu Val Val Arg Gin Ala Leu 
2900 2905 



cag cgc gcg ggc gtg 
Gin Arg Ala Gly Val 
2910 



gee cag gee 
Ala Gin Ala 



gcg ccc gat 
Ala Pro Asp 



teg ate age tac ate gag gcg cat ggc acc ggc acc gta eta ggc 
Ser He Ser Tyr He Glu Ala His Gly Thr Gly Thr Val Leu Gly 
2915 2920 2925 

gat gca gtg gag ttg ggt get ttg aat aaa gtg ttc 
Asp Ala Val Glu Leu Gly Ala Leu Asn Lys Val Phe 
2930 2935 2940 

gcg gcg cca tgc ccg ate ggc teg ctg aag gcg aac 
Ala Ala Pro Cys Pro He Gly Ser Leu Lys Ala Asn 
2945 2950 2955 



gac aag cgc 
Asp Lys Arg 



ate ggc cat 
He Gly His 
gcc gaa agc gcc gcg ggc atc gcc ggc ctg gcc aag ctg gta ttg
Ala Glu Ser Ala Ala Gly Ile Ala Gly Leu Ala Lys Leu Val Leu
2960 2965 2970 

cag ttc agg cat ggc gag ttg gtg cct agt ctg aat gcg ttt ccc 

Gln Phe Arg His Gly Glu Leu Val Pro Ser Leu Asn Ala Phe Pro
2975 2980 2985 



8919 



8964 



ttg aat ccc tat att gag ttc 
Leu Asn Pro Tyr lie Glu Phe 
2990 2995 



ggt cgc ttc cag gta 
Gly Arg Phe Gin Val 
3000 



caa cag cag 
Gin Gin Gin 



9009 



ccg gca ccg tgg ccg cgc cgt ggc gcc cag ccg egg 
Pro Ala Pro Trp Pro Arg Arg Gly Ala Gin Pro Arg 
3005 3010 3015 



cgc gcc ggg 
Arg Ala Gly 



tta tct gcc ttc ggt get ggc gga teg aat gcg cac eta gtg gta 
Leu Ser Ala Phe Gly Ala Gly Gly Ser Asn Ala His Leu Val Val 
3020 3025 3030 

gag gaa get ccg get atg get ccc ggg gtc teg ate age gcc age 
Glu Glu Ala Pro Ala Met Ala Pro Gly Val Ser lie Ser Ala Ser 
3035 3040 3045 

tct cca gcc ttg ate gtg ctt teg gcg cga acg ctg cct gcc ttg 
Ser Pro Ala Leu lie Val Leu Ser Ala Arg Thr Leu Pro Ala Leu 
3050 3055 3060 

caa cag cgt get cgc gat ctg etc gtc tgg atg caa gcg egg cag 
Gin Gin Arg Ala Arg Asp Leu Leu Val Trp Met Gin Ala Arg Gin 
3065 3070 3075 



gtg gat gac gtc atg ctg gcc gac gtt get tat acg 
Val Asp Asp Val Met Leu Ala Asp Val Ala Tyr Thr
3080 3085 3090

ggc cgc gtc gcg atg gag caa cgc ctg get ttt acc 
Gly Arg Val Ala Met Glu Gin Arg Leu Ala Phe Thr 
3095 3100 3105 



ctg cac ttg 
Leu His Leu 



get ggc teg 
Ala Gly Ser 



9054 



9099 



9144 



9189 



9234 



9279 



9324 



get gcc gag ttg age gag aaa tta cag get tac ctg ggc cat gcg 
Ala Ala Glu Leu Ser Glu Lys Leu Gin Ala Tyr Leu Gly His Ala 
3110 3115 3120 



9369 



att egg gcc gac ate tat ctg 

He Arg Ala Asp He Tyr Leu 
3125 3130 

gca ggc get ccg ate gtg gcc 

Ala Gly Ala Pro He Val Ala 
3140 3145 



age gag gac acg ccc 
Ser Glu Asp Thr Pro 
3135 

gag gaa gat ctg etc 
Glu Glu Asp Leu Leu 
3150 



ggc aaa ccg 
Gly Lys Pro 



acg ctg atg 
Thr Leu Met 



9414 



9459 



gat gcc tgg ate gaa aag ggc cag tac ggt cgt ttg 
Asp Ala Trp He Glu Lys Gly Gin Tyr Gly Arg Leu 
3155 3160 3165 



ctg gag tac 
Leu Glu Tyr 



9504 



tgg acc aag ggc caa ccg atc
Trp Thr Lys Gly Gln Pro Ile
3170 3175



gac tgg aac aaa etc tat tgg cgc 
Asp Trp Asn Lys Leu Tyr Trp Arg 
3180 



9549 



aag ctg tat gcg gac gga egg 
Lys Leu Tyr Ala Asp Gly Arg 



ccg egg egg ate age 
Pro Arg Arg He Ser 



ctg ccc acc 
Leu Pro Thr 



9594 
3185



3190 



3195 



tat ccg ttc gag cac cgg cgt tat tgg caa acg ccg gtg ccg ggc
Tyr Pro Phe Glu His Arg Arg Tyr Trp Gln Thr Pro Val Pro Gly
3200 3205 3210 

gag cga age ctg cac gec acc gcg cca get act egg gaa acg gtt 
Glu Arg Ser Leu His Ala Thr Ala Pro Ala Thr Arg Glu Thr Val 
3215 3220 3225 

gcg gtt ggt gec atg ccg gat ccg gec ggc get acg gtg caa gec 
Ala Val Gly Ala Met Pro Asp Pro Ala Gly Ala Thr Val Gin Ala 
3230 3235 3240 



egg ttg tgc gec ttg tgc caa gtg ttg ttg ggc aaa 
Arg Leu Cys Ala Leu Cys Gin Val Leu Leu Gly Lys 
3245 3250 3255 



ccg gtc acg 
Pro Val Thr 



gee cag atg gat ttc ttt gee 
Ala Gin Met Asp Phe Phe Ala 
3260 3265 



gtc ggc ggc cat teg gtg ctg gcg 
Val Gly Gly His Ser Val Leu Ala 
3270 



ate caa ttg gtc teg cgc ate cgc aaa age ttc ggg gtg gag tat 
lie Gin Leu Val Ser Arg He Arg Lys Ser Phe Gly Val Glu Tyr 
3275 3280 3285 

ccg gtc age get ttg ttc gaa teg gcg ctg ttg teg gac atg gcg 
Pro Val Ser Ala Leu Phe Glu Ser Ala Leu Leu Ser Asp Met Ala 
3290 3295 3300 

egg cag ate gaa caa ttg egg gtg aac gga gtc gec aag cgc atg 
Arg Gin He Glu Gin Leu Arg Val Asn Gly Val Ala Lys Arg Met 
3305 3310 3315 

ccg gcg ttg ttg cct gcc ggg cgc gtg ggc gcg att cct gcg act
Pro Ala Leu Leu Pro Ala Gly Arg Val Gly Ala Ile Pro Ala Thr
3320 3325 3330 

tat gca cag gag cgc eta tgg etc gtc cac gaa cat atg agt gag 
Tyr Ala Gin Glu Arg Leu Trp Leu Val His Glu His Met Ser Glu 
3335 3340 3345 

caa cgc agt agt tac aac ate acc ttt gee atg cac ttc aga ggc 
Gin Arg Ser Ser Tyr Asn He Thr Phe Ala Met His Phe Arg Gly 
3350 3355 3360 

gtc gac ttc cgt get gaa gcg atg cgt gee gca ttg aac gcg ctg 
Val Asp Phe Arg Ala Glu Ala Met Arg Ala Ala Leu Asn Ala Leu 
3365 3370 3375 

gtg gtg egg cac gaa gtg ctg cgc aca cgc ttt ctt teg gag gac 
Val Val Arg His Glu Val Leu Arg Thr Arg Phe Leu Ser Glu Asp 
3380 3385 3390 

ggg cag ctg caa cag gtg ate get gec teg ttg acg ttg gag gtg 
Gly Gin Leu Gin Gin Val He Ala Ala Ser Leu Thr Leu Glu Val 
3395 3400 3405 



ccg gta aga gag atg teg gtc 
Pro Val Arg Glu Met Ser Val 
3410 3415 



gag gag gtc gac ctg ctg ctg gec 
Glu Glu Val Asp Leu Leu Leu Ala 
3420 



gcg agc acg cgg gag act ttc gat ctg cgg cag ggg ccc ttg ttc
Ala Ser Thr Arg Glu Thr Phe Asp Leu Arg Gln Gly
3425 3430 3435 

aag gca cgc ate ctg cgc gtg gcg gec gat cac cat 
Lys Ala Arg lie Leu Arg Val Ala Ala Asp His His 

3440 3445 3450 

age age ate cac cac ate att tec gac ggc tgg teg 
Ser Ser He His His He He Ser Asp Gly Trp Ser 

3455 3460 3465 



ttc aac cgt gac ctg cac cag 
Phe Asn Arg Asp Leu His Gin 
3470 3475 



ctg tac gag gcg tgt 
Leu Tyr Glu Ala Cys 
3480 



acg ccc ccc aca ctg ccg acg ctg gcg gtg cag tat 
Thr Pro Pro Thr Leu Pro Thr Leu Ala Val Gin Tyr 
3485 3490 3495 



Pro Leu Phe 



gtg gtg ttg 
Val Val Leu 

ctg gga gtg 
Leu Gly Val 



ttg cgc ggc 
Leu Arg Gly 



gee gac tac 
Ala Asp Tyr 



gcg ctg tgg caa egg caa tgg gag ctg gcg get ccg ctg teg tac 
Ala Leu Trp Gin Arg Gin Trp Glu Leu Ala Ala Pro Leu Ser Tyr 
3500 3505 3510 

tgg acg egg gca ctg gaa ggc tac gac gac ggc ctg gac ttg ccc 
Trp Thr Arg Ala Leu Glu Gly Tyr Asp Asp Gly Leu Asp Leu Pro 
3515 3520 3525 



tac gac egg ccg cgc ggc gee 
Tyr Asp Arg Pro Arg Gly Ala 
3530 3535 



acg egg gcg tgg egg gca ggg ctg 
Thr Arg Ala Trp Arg Ala Gly Leu 
3540 



gtc aaa cac cgc tat ccg ccg caa ctg gee cag cag ttg gcg gee 
Val Lys His Arg Tyr Pro Pro Gin Leu Ala Gin Gin Leu Ala Ala 
3545 3550 3555 



tac agc caa cag tac caa gcg acg ctg ttc atg agc ctg ctg gca
Tyr Ser Gln Gln Tyr Gln Ala Thr Leu Phe Met Ser Leu Leu Ala
3560 3565 3570



ggc ctg gcg ttg gtg ctg ggc cgt tac gee gat cgc aag gac gtg 
Gly Leu Ala Leu Val Leu Gly Arg Tyr Ala Asp Arg Lys Asp Val 
3575 3580 3585 



tgc ate ggc gcg acg gtc tec 
Cys He Gly Ala Thr Val Ser 
3590 3595 



ggc cgc gac cag ctg gag ctg gaa 
Gly Arg Asp Gin Leu Glu Leu Glu 
3600 



gag ctg ate ggc ttt ttc ate aat att ttg ccg ctg 
Glu Leu He Gly Phe Phe He Asn He Leu Pro Leu 
3605 3610 3615 



ctg teg ggg gat ccg tgc ctg 
Leu Ser Gly Asp Pro Cys Leu 
3620 3625 



gag gag gtg ctg ctg 
Glu Glu Val Leu Leu 
3630 



caa gtg gta ctg gat ggc ttc gcg cac cag teg gtg 
Gin Val Val Leu Asp Gly Phe Ala His Gin Ser Val 
3635 3640 3645 



egg gtg gac 
Arg Val Asp 



cgc acg cgt 
Arg Thr Arg 



ccg ttc gag 
Pro Phe Glu 



cac gtg ttg cag gcg ctg egg 
His Val Leu Gin Ala Leu Arg 
3650 3655 



cgt cag cgc gac agt 
Arg Gin Arg Asp Ser 
3660 



age cag ate 
Ser Gin He 



ccg ctg gtg ccg gtg atg ctg cga cac cag aac ttc ccg acg cag 
Pro Leu Val Pro Val Met Leu Arg His Gln Asn Phe Pro Thr Gln
3665 3670 3675 



gag att ggc gat tgg ccc gag 
Glu lie Gly Asp Trp Pro Glu 
3680 3685 



gga gtg egg ctg acg 
Gly Val Arg Leu Thr 
3690 



ctg ggg ctg gac cgt age acg ccg age gag ctg gat 
Leu Gly Leu Asp Arg Ser Thr Pro Ser Glu Leu Asp 
3695 3700 3705 



tac ggc gac ggc age teg ctg 
Tyr Gly Asp Gly Ser Ser Leu 
3710 3715 



gag ctg acg ctg gaa 
Glu Leu Thr Leu Glu 
3720 



cag atg gag 
Gin Met Glu 



tgg cag ttc 
Trp Gin Phe 



tac gcg cag 
Tyr Ala Gin 



gac etc ttc gac gaa gcg acg gtg egg egg atg ate gca cac cac 
Asp Leu Phe Asp Glu Ala Thr Val Arg Arg Met He Ala His His 
3725 3730 3735 

cag cag gcg ttg gag gcg atg gtg age egg cca cag ctg egg gtg 
Gin Gin Ala Leu Glu Ala Met Val Ser Arg Pro Gin Leu Arg Val 
3740 3745 3750 



ggc aag tgg gac atg ctg acg 
Gly Lys Trp Asp Met Leu Thr 
3755 3760 



gee gaa gag cgc egg 
Ala Glu Glu Arg Arg 
3765 



ctg ttt gec 
Leu Phe Ala 



gcg eta aat gcg aca ggt acg cca egg gag tgg ccc agt ctg gcg 
Ala Leu Asn Ala Thr Gly Thr Pro Arg Glu Trp Pro Ser Leu Ala 
3770 3775 3780 

cag cag ttc gaa egg cag gcg cag gcg acg ccg cag gec ata gca 
Gin Gin Phe Glu Arg Gin Ala Gin Ala Thr Pro Gin Ala He Ala 
3785 3790 3795 



11079 



11124 



11169 



11214 



11259 



11304 



11349 



11394 



,tgc gcg - age gat <jg^c ? a ~cg t*zz »sc -tat gec: cag r*.y Qag.gc^ 
Cys Val Ser Asp Gly Gln Ser Trp Ser Tyr Ala Gln Leu Glu Ala
3800 3805 3810 



cgc gee aac cag ctg gca cag gcg ctg cgt ggg cag 
Arg Ala Asn Gin Leu Ala Gin Ala Leu Arg Gly Gin 
3815 3820 3825 

egg gac gtg egg gtg gcg gta cag agt gcg cgc acg 
Arg Asp Val Arg Val Ala Val Gin Ser Ala Arg Thr 
3830 3835 3840 



ggc gcg ggc 
Gly Ala Gly 



ccg gaa ctg 
Pro Glu Leu 



ctg atg gee ttg ctg gcg ate ttc aag gec ggt gca tgc tat gtg 
Leu Met Ala Leu Leu Ala He Phe Lys Ala Gly Ala Cys Tyr Val 
3845 3850 3855 

ccg ate gat ccg gee tac ccg gcg gee tac cgc gag cag ate ctg 
Pro He Asp Pro Ala Tyr Pro Ala Ala Tyr Arg Glu Gin He Leu 
3860 3865 3870 



11484 



11529 



11574 



11619 



gec gag gtg cag gtg teg ate gtg ctg gag caa gac gag ctg gcg 
Ala Glu Val Gin Val Ser He Val Leu Glu Gin Asp Glu Leu Ala 
3875 3880 3885 



11664 



ctg gac gag caa ggg cag ttc cac aat ccg cgt tgg cgc gag caa 
Leu Asp Glu Gin Gly Gin Phe His Asn Pro Arg Trp Arg Glu Gin 
3890 3895 3900 



11709 
gcc ccg acg ccg ctg ggg ctg agg gaa cat ccg ggc
Ala Pro Thr Pro Leu Gly Leu Arg Glu His Pro Gly 
3905 3910 3915 



tgc gtg atg gtg acc tcc ggc
Cys Val Met Val Thr Ser Gly
3920 3925 



teg acc ggc egg ccc 
Ser Thr Gly Arg Pro 
3930 



atg gtg ccg tat gcg cag ctg cac aac tgg ctg cat 
Met Val Pro Tyr Ala Gin Leu His Asn Trp Leu His 
3935 3940 3945 



gac ctg gcg 
Asp Leu Ala 



aag ggc gtg 
Lys Gly Val 



gca ggc tgg 
Ala Gly Trp 



cag cgt tct gcg ttc gag gcc 
Gin Arg Ser Ala Phe Glu Ala 
3950 3955 



ggg gag egg gtg ctg cag aag acc 
Gly Glu Arg Val Leu Gin Lys Thr 
3960 



teg ate gcc ttt gcg gtg teg gta aag gag ttg eta age ggg ctg 
Ser lie Ala Phe Ala Val Ser Val Lys Glu Leu Leu Ser Gly Leu 
3965 3970 3975 

ctg gcg ggg gtg gaa cag gtg atg ctg ccg gac gag cag gtg aag 
Leu Ala Gly Val Glu Gin Val Met Leu Pro Asp Glu Gin Val Lys 
3980 3985 3990 

gac age ctg gcg ttg gcg egg gcg att gag caa tgg cag gtg acg 
Asp Ser Leu Ala Leu Ala Arg Ala lie Glu Gin Trp Gin Val Thr 
3995 4000 4005 



egg ctg tac eta gtg cca teg 
Arg Leu Tyr Leu Val Pro Ser 
4010 4015 



cac ctg cag gcg ctg ctg gac gcg 
His Leu Gin Ala Leu Leu Asp Ala 
4020 



acg caa gga cga gac ggg eta ctg cac teg ctg cgt cac gtg gtg 
Thr Gin Gly Arg Asp Gly Leu Leu His Ser Leu Arg His Val Val 
4025 4030 4035 

acg gcg ggg gaa gcg ttg ccg tct gcg gtg cgc gaa acg gtg egg 
Thr Ala Gly Glu Ala Leu Pro Ser Ala Val Arg Glu Thr Val Arg 
4040 4045 4050 

gtg cgt ctg cca cag gtg cag eta tgg aac aac tat ggc tgc acg 
Val Arg Leu Pro Gin Val Gin Leu Trp Asn Asn Tyr Gly Cys Thr 
4055 4060 4065 

gaa ctg aac gac gcg acc tac cat egg teg gat acg gtg gcg cca 
Glu Leu Asn Asp Ala Thr Tyr His Arg Ser Asp Thr Val Ala Pro 
4070 4075 4080 

gga acg ttt gtg ccg ate ggc gca ccg ate gcc aac acc gag gta 
Gly Thr Phe Val Pro lie Gly Ala Pro lie Ala Asn Thr Glu Val 
4085 4090 4095 



tac gtg 
Tyr Val

4100 



ctg gac egg cag ctg egg cag gtg ccg ate 
Leu Asp Arg Gin Leu Arg Gin Val Pro lie 
4105 4110 



ggc gag ctg cac gta cac age 
Gly Glu Leu His Val His Ser 
4115 4120 

aac egg ccg ggg ctg acg gcc 
Asn Arg Pro Gly Leu Thr Ala 
4130 4135 



gtg ggg atg gcg cgc 
Val Gly Met Ala Arg 
4125 

teg cgc ttc ate gcg 
Ser Arg Phe lie Ala 
4140 



ggg gtg atg 

Gly Val Met 



ggc tac tgg 
Gly Tyr Trp 



cac ccg tat 
His Pro Tyr 



11754 



11799 



11844 



11889 



11934 



11979 



12024 



12069 



12114 



12159 



12204 



12249 



12294 



12339 



12384 



12429 
agc gag gag ccg ggc aca cgg ctg tac aag acc ggt
Ser Glu Glu Pro Gly Thr Arg Leu Tyr Lys Thr Gly 
4145 4150 4155 



cgc egg ctg gcg gac ggg acg 
Arg Arg Leu Ala Asp Gly Thr 
4160 4165 

ttc gag gtc aag gtg cgc ggc 
Phe Glu Val Lys Val Arg Gly 
4175 4180 

gag gcg gec ttg egg gcg cag 
Glu Ala Ala Leu Arg Ala Gin 
4190 4195 



ctg gaa tac ctg ggc 
Leu Glu Tyr Leu Gly 
4170 

cac egg gtg gat acg 
His Arg Val Asp Thr 
4185 

ccc gcg gtg gec gag 
Pro Ala Val Ala Glu 
4200 



age ggt cac egg gtg gac ggg gac atg cag ttg gtg 
Ser Gly His Arg Val Asp Gly Asp Met Gin Leu Val 
4205 4210 4215 



gac atg gta 
Asp Met Val 



cga cag gac 
Arg Gin Asp 



egg cag gtg 
Arg Gin Val 



gcg gtg gtg 
Ala Val Val



gec tat gtg 
Ala Tyr Val 



gtg gcg cgt gaa ggg cag gca 
Val Ala Arg Glu Gly Gin Ala 
4220 4225 



ccg age gcg ggc gag 
Pro Ser Ala Gly Glu 
4230 



cag ctg teg gcg cag ttg ccg acc tac atg ctg ccg 
Gin Leu Ser Ala Gin Leu Pro Thr Tyr Met Leu Pro 
4235 4240 4245 



cag tgg ctg gag cag ttg ccg 
Gin Trp Leu Glu Gin Leu Pro 
4250 4255 

egg ttg gcg ctg ccg gcg ccg 
Arg Leu Ala Leu Pro Ala Pro 
4265 4270 



egg ctg tec aac ggc 
Arg Leu Ser Asn Gly 
4260 

cag gtg gta cac gcg 
Gin Val Val His Ala 
4275 



ttg aaa caa 
Leu Lys Gin 



acc gtg tac 
Thr Val Tyr 



aag ttg gac 
Lys Leu Asp 



cag gag tac 
Gin Glu Tyr 



gtc gcg cca cgc aac gag gee gag caa egg ctg gcg gca ctg ttt 
Val Ala Pro Arg Asn Glu Ala Glu Gin Arg Leu Ala Ala Leu Phe 
4280 4285 4290 



gec gag gtg ctg egg gtg gag cag gtg ggc ate cac 
Ala Glu Val Leu Arg Val Glu Gin Val Gly He His 
4295 4300 4305 



gac aac ttc 
Asp Asn Phe 



ttc gee ttg ggt ggg cac teg 
Phe Ala Leu Gly Gly His Ser 
4310 4315 



ctg tct gca teg caa ctg ate teg 
Leu Ser Ala Ser Gin Leu He Ser 
4320 



cgc ate cgc caa agt ttt cac gtc gat ctg ccg ctg age egg ate 
Arg He Arg Gin Ser Phe His Val Asp Leu Pro Leu Ser Arg He 
4325 4330 4335 



ttc gag gca ccc acg ate gag 
Phe Glu Ala Pro Thr He Glu 
4340 4345 



ggc ctg gtc agg cag eta gcg ttg 
Gly Leu Val Arg Gin Leu Ala Leu 
4350 



cct agt gaa ggc ggc gtg gee age ate gee agg gta gcg cga aac 
Pro Ser Glu Gly Gly Val Ala Ser He Ala Arg Val Ala Arg Asn 
4355 4360 4365 



egg acg ate cca ttg teg ctg 
Arg Thr He Pro Leu Ser Leu 
4370 4375 



ttc cag gaa cgc ctg tgg ttc gtg 
Phe Gin Glu Arg Leu Trp Phe Val 
4380 



12474 



12519 



12564 



12609 



12654 



12699 



12744 



12789 



12834 



12879 



12924 



12969 



13014 



13059 



13104 



13149 






cac caa cac atg cct gag caa cgc acc agt tac aac 
His Gin His Met Pro Glu Gin Arg Thr Ser Tyr Asn 
4385 4390 4395 



gcc ttg cgt ttg cgt ggt cct 
Ala Leu Arg Leu Arg Gly Pro 
4400 4405 



ttg teg gtg gaa gcg 
Leu Ser Val Glu Ala 
4410 



gcg ctg cgt gcg tta gtg ctg cgc cac gaa ate ttg 
Ala Leu Arg Ala Leu Val Leu Arg His Glu He Leu 
4415 4420 4425 



ggc acg etc 
Gly Thr Leu 



atg cgt gca 
Met Arg Ala 



cgt acc cgc 
Arg Thr Arg 



ttc gtg ttg ccg acc ggt get 
Phe Val Leu Pro Thr Gly Ala 
4430 4435 



age gag ccg gtg cag gtc att gac 
Ser Glu Pro Val Gin Val He Asp 
4440 



gag cac age gat ttc cag etc tea gta cag eta gtc gag gat act 
Glu His Ser Asp Phe Gin Leu Ser Val Gin Leu Val Glu Asp Thr 
4445 4450 4455 



gag ate gcg teg ctg atg gat 
Glu He Ala Ser Leu Met Asp 
4460 4465 



gaa ctg gca agt cat ate tac gac 
Glu Leu Ala Ser His He Tyr Asp 
4470 



tta gcc aac ggc ccg ctg ttc att gca tgc ctt ttg caa ctg gat 
Leu Ala Asn Gly Pro Leu Phe He Ala Cys Leu Leu Gin Leu Asp 
4475 4480 4485 



gag caa gaa cat gtg ctg eta 
Glu Gin Glu His Val Leu Leu 
4490 4495 



ate ggc atg cat cac ctt ate tac 
He Gly Met His His Leu He Tyr 
4500 



gac get tgg teg caa ttc acc gtg atg aac cgc gat eta cgc gtg 
Asp Ala Trp Ser Gin Phe Thr Val Met Asn Arg Asp Leu Arg Val 



4505 4510 4515



ctg tat cac cgc cac etc gga 
Leu Tyr His Arg His Leu Gly 
4520 4525 



ctt gcc ggc gga gat 
Leu Ala Gly Gly Asp 
4530 



ctg ccg gaa 
Leu Pro Glu 



tta ccg ate caa tat gcc gac tat gcg ate tgg caa cgc gcc cag 
Leu Pro He Gin Tyr Ala Asp Tyr Ala He Trp Gin Arg Ala Gin 
4540 4545 



4535 



aac ctg gac gcg caa ctg gcc tat tgg cag get atg 
Asn Leu Asp Ala Gin Leu Ala Tyr Trp Gin Ala Met 
4550 4555 4560 

tac gac gac ggc ctg gag ctg ccc tac gac tat ccg 
Tyr Asp Asp Gly Leu Glu Leu Pro Tyr Asp Tyr Pro 
4565 4570 4575 

aat cgc acc tgg cac gca gcg gtc tac aca cac acc 
Asn Arg Thr Trp His Ala Ala Val Tyr Thr His Thr 
4580 4585 4590 

gaa ctg gta cag cgc ttt gcc ggc ttc gta cag gcg 
Glu Leu Val Gln Arg Phe Ala Gly Phe Val Gln Ala
4595 4600 4605 



acc ttg ttc ate ggg ctg ttg 
Thr Leu Phe He Gly Leu Leu 



gcc age ttc gcg gtc 
Ala Ser Phe Ala Val 



ttg cac gac 
Leu His Asp 



cgt ccg cgc 
Arg Pro Arg 



tat ccg get 
Tyr Pro Ala 



cat cag teg 
His Gin Ser 



gtg ttg aac 
Val Leu Asn 



13194 



13239 



13284 



13329 



13374 



13419 



13464 



13509 



13554 



13599 



13644 



13689 



13734 



13779 



13824 



13869 
4610



4615 



4620 



aaa tac acc ggc cgg gac gac ttg tgc atc ggt acc
Lys Tyr Thr Gly Arg Asp Asp Leu Cys Ile Gly Thr
4625 4630 4635

ggg cgc acg cac ctg gag ctg gag aac ctg atc ggt
Gly Arg Thr His Leu Glu Leu Glu Asn Leu Ile Gly
4640 4645 4650

aac atc ttg cct ttg cgc ttg cgc ttg gac ggc gat
Asn Ile Leu Pro Leu Arg Leu Arg Leu Asp Gly Asp
4655 4660 4665 



acc acg gca 
Thr Thr Ala 



ttc ttc ate 
Phe Phe He 



ccg gac gtt 
Pro Asp Val 



gcc gaa
Ala Glu
4670

gag aac
Glu Asn
4685



ate atg egg cga aca 
He Met Arg Arg Thr 
4675 



egg ttg gtg gcg atg 
Arg Leu Val Ala Met 
4680 



age gcg ttt 
Ser Ala Phe 



cag gcg eta ccg ttc gag cac ctg etc aac gec ctg cac 
Gin Ala Leu Pro Phe Glu His Leu Leu Asn Ala Leu His 
4690 4695 



aag caa cgt gac acc age egg att ccg eta gtt ccg 
Lys Gin Arg Asp Thr Ser Arg He Pro Leu Val Pro 
4700 4705 4710 

cgt cat cag aac ttc ccg gac acg ate ggc gac tgg 
Arg His Gin Asn Phe Pro Asp Thr He Gly Asp Trp 
4715 4720 4725 



gtg gtg atg 
Val Val Met 



age gat ggc 
Ser Asp Gly 



ate cgt acc gaa gtg ate cag 
He Arg Thr Glu Val He Gin 
4730 4735 



cgc gat ctg cgt gee 
Arg Asp Leu Arg Ala 
4740 



acc ccc aat 
Thr Pro Asn 



gaa atg gac ctg caa ttc ttc ggc gac ggt acg ggg ctt tcg gtc
Glu Met Asp Leu Gln Phe Phe Gly Asp Gly Thr Gly Leu Ser Val
4745 4750 4755



aca gtg gaa tac gcg gcg gag 
Thr Val Glu Tyr Ala Ala Glu 
4760 4765 



ctg ttc tea gaa gcg acc att cgc 
Leu Phe Ser Glu Ala Thr He Arg 
4770 



cac ctg atc cac cat cac caa ctc gtc ctg gag cag atg ttg gcg
His Leu Ile His His His Gln Leu Val Leu Glu Gln Met Leu Ala
4775 4780 4785 



gee cat gaa age gee acg tgc 
Ala His Glu Ser Ala Thr Cys 
4790 4795 



ccc ttg gat gtt gcc gac tag
Pro Leu Asp Val Ala Asp
4800



13914 



13959 



14004 



14049 



14094 



14139 



14184 



14229 



14274 



14319 



14364 



14406 



<210> 4 

<211> 4801 

<212> PRT 

<213> Xanthomonas albilineans 
<220> 

<221> misc_feature 

<222> (226) . . (240) 

<223> Acyl-CoA ligase subdomain I 

<220> 

<221> misc_feature 
<222> (486)..(493)

<223> Acyl-CoA ligase subdomain II

<220>

<221> misc_feature

<222> (897)..(913)

<223> Beta-ketoacyl synthase 1 subdomain I

<220>

<221> misc_feature

<222> (1038)..(1047)

<223> Beta-ketoacyl synthase 1 subdomain II

<220>

<221> misc_feature

<222> (1080)..(1089)

<223> Beta-ketoacyl synthase 1 subdomain III

<220>

<221> misc_feature

<222> (2777)..(2793)

<223> Beta-ketoacyl synthase 2 subdomain I

<220>

<221> misc_feature

<222> (2918)..(2927)

<223> Beta-ketoacyl synthase 2 subdomain II

<220>

<221> misc_feature

<222> (2955)..(2964)

<223> Beta-ketoacyl synthase 2 subdomain III

<220>

<221> misc_feature
<222r ^ H«12> . . (1942) 

<223> Beta-ketoacyl reductase domain 

<220> 

<221> misc_feature

<222> (667) . . (678) 

<223> Acyl carrier protein 1 domain 
<220> 

<221> misc_feature

<222> (2484)..(2495)

<223> Acyl carrier protein 2 domain 

<220> 

<221> misc_feature

<222> (2568) .. (2579) 

<223> Acyl carrier protein 3 domain 

<220> 

<221> misc_feature

<222> (3806) . . (3811) 

<223> Adenylation subdomain I 

<220> 

<221> misc_feature

<222> (3851) . . (3861) 

<223> Adenylation subdomain II 



<220>

<221> misc_feature

<222> (3917)..(3932)

<223> Adenylation subdomain III

<220> 

<221> misc_feature

<222> (3967) . . (3970) 

<223> Adenylation subdomain IV 

<220> 

<221> misc_feature

<222> (4063) . . (4069) 

<223> Adenylation subdomain V 

<220> 

<221> misc_feature 

<222> (4114) . . (4128) 

<223> Adenylation subdomain VI 

<220> 

<221> misc_feature

<222> (4152) . . (4157) 

<223> Adenylation subdomain VII 

<220> 

<221> misc_feature

<222> (4170) . . (4189) 

<223> Adenylation subdomain VIII 

<220> 

<221> misc_feature

<222> (4239) . . (4245) 

<223> Adenylation subdomain IX 

<220> 

<221> misc_feature

<222> (4259)..(4264)

<223> Adenylation subdomain X 

<220> 

<221> misc_feature

<222> (3261) . . (3271) 

<223> Peptidyl carrier protein 1 domain 
<220> 

<221> misc_feature

<222> (4306)..(4316)

<223> Peptidyl carrier protein 2 domain 
<220> 

<221> misc_feature

<222> (3333)..(3342)

<223> Condensation domain 1 subdomain I 
<220> 

<221> misc_feature

<222> (3381) . . (3389) 

<223> Condensation domain 1 subdomain II 
<220> 

<221> misc_feature 

<222> (3456) . . (3465) 



<223> Condensation domain 1 subdomain III



<220> 

<221> misc_feature 

<222> (3495) . . (3501) 

<223> Condensation domain 1 subdomain IV 
<220> 

<221> misc_feature

<222> (3606) . . (3617) 

<223> Condensation domain 1 subdomain V 



<220> 

<221> misc_feature 

<222> (3641) . . (3647) 

<223> Condensation domain 1 subdomain VI 



<220> 

<221> misc_feature 

<222> (3658) . . (3665) 

<223> Condensation domain 1 subdomain VII 
<220> 

<221> misc_feature 

<222> (4374)..(4383)

<223> Condensation domain 2 subdomain I 



<220> 

<221> misc_feature 

<222> (4421)..(4429)

<223> Condensation domain 2 subdomain II 
<220> 

<221> misc_feature

<222> (4498)..(4507)

<223> Condensation domain 2 subdomain III



<220> 

<221> misc_feature 

<222> (4538) . . (4544) 

<223> Condensation domain 2 subdomain IV 



<220>

<221> misc_feature

<222> (4649) . . (4659) 

<223> Condensation domain 2 subdomain V 



<220> 

<221> misc_feature

<222> (4685) . . (4691) 

<223> Condensation domain 2 subdomain VI 
<220> 

<221> misc_feature 

<222> (4701) . . (4708) 

<223> Condensation domain 2 subdomain VII 



<400> 4 

Leu Pro Asn Ala Leu Met Gln Ile Thr Leu Val Ala Val Gln Phe Ala
1 5 10 15 

Gly Val Leu Leu Gly Val Thr Ala Arg Ala Ala Ile Pro Asn Lys Ala
20 25 30

Gly Met Arg Arg Ala Trp Pro Pro Phe Pro Gln Ala Cys Cys Arg Ser
35 40 45



Ile Ala Tyr Leu Met Gln Arg Ser Pro Met Ser Pro Leu Gln Gln Thr
50 55 60 



Leu Leu Thr Arg Leu Ala Ser Ala Ala Ala Ser Arg Thr Met Ile Glu
65 70 75 80



Phe Pro Arg Pro Glu His Ala Ser Pro Gln Cys Cys Asp Asp Ala Glu
85 90 95 



Leu Ala Arg Leu Ile Val Gln Leu Ser Ala Gly Leu Gln Pro Leu Ala
100 105 110



Met Pro Gly Thr Tyr Val Ile Ile Ala Ala Pro His Gly Gly Leu Phe
115 120 125 



Ala Ala Ala Leu Leu Ala Cys Leu His Ala Asn Leu Val Ala Val Pro 
130 135 140 



She Pro Leu Asp Val Ala Gin Pro Asn Glu Arg Glu Gin Ala Arg Leu 
145 150 155 160 



Glu Thr He Kic Ala" Gin Leu Met Gjlu Hie G\y Ae* Val Ale Val hsa 
165 170 175 



Leu Asp Asp Val Ala Asp Arg Ser Ala Phe Ala Arg Met Ala His Ala 
180 185 190 



Ala Gly Thr Phe Leu Ala Thr Phe Ala Asp Leu Lys Arg Glu Ser Thr 
195 200 205 



Ser Ala Ser Leu Cys Pro Ala Ser Pro Ser Asp Ala Ala Leu Leu Leu 
210 215 220 

Phe Thr Ser Gly Ser Ser Gly Glu Ser Lys Gly Ile Leu Leu Ser His
225 230 235 240



Arg Asn Leu His His Gin He Gin Ala Gly He Arg Gin Trp Ser Leu 
245 250 255 



Asp Glu His Ser His Val Val Thr Trp Leu Ser Pro Ala His Asn Phe
260 265 270 



Gly Leu His Phe Gly Leu Leu Ala Pro Trp Phe Ser Gly Ala Thr Val 
275 280 285



Ser Phe He His Pro His Ser Tyr Met Lys Arg Pro Gly Phe Trp Leu 
290 295 300 



Glu Thr Val Ala Ala Arg Asp Ala Thr His Met Ala Ala Pro Asn Phe 
305 310 315 320 



Ala Phe Asp Tyr Cys Cys Asp Trp Val Met Val Glu Gin Leu Pro Pro 
325 330 335 



Ser Ala Leu Ser Thr Leu Thr His He Val Cys Gly Gly Glu Pro Val 
340 345 350 



Arg Ala Ser Thr Met Gin Arg Phe Phe Glu Lys Phe Ala Gly Leu Gly 
355 360 365 



Ala Arg Thr Gin Thr Phe Met Pro His Phe Gly Leu Ser Glu Thr Gly 
370 375 380 



Ala Leu Ser Thr Leu Asp Glu Ala Pro Gin Gin Arg Val Leu Glu Leu 
385 390 395 400 



Asp Ala Asp Ala Leu Asn Lys Arg Lys Arg Val Ala Ala Gly Ala Ser 
405 410 415 



dlz -M?. ^rg -Val Thr Val he*z Asn • y«a Oiy A1e Val - 1 * ' p Gig: x - : • 
420 425 430 



Glu Leu Arg He Val Cys Pro Glu Gly Glu Thr Leu Cys Arg Pro Asp 
435 440 445 



Glu He Gly Glu He Trp Val Lys Ser Pro Ala He Ala Arg Gly Tyr 
450 455 460 



Leu Phe Ala Lys Pro Ala Asp Gin Arg Gin Phe Asn Cys Ser He Arg 
465 470 475 480 



His Thr Asp Asp Ser Gly Tyr Phe Arg Thr Gly Asp Leu Gly Phe He 
485 490 495 



Ala Asp Gly Cys Leu Tyr Val Thr Gly Arg Val Lys Glu Val Leu He 
500 505 510 



He Arg Gly Lys Asn His Tyr Pro Ala His He Glu Ala Ser He Ala 
515 520 525 
Ala Thr Ala Ser Pro Gly Ala Leu Met Pro Val Val Phe Ser Ile Glu
530 535 540 



Arg Gin Asp Glu Glu Arg Val Ala Ala Val He Ala Val Asn His Pro 
545 550 555 560



Trp Thr Pro Ala Ala Cys Ala Ala Gin Ala His Lys lie Arg Gin Gin 
565 570 575 



Val Ala Asp Gin His Gly Val Ala Leu Ala Glu Leu Ala Phe Ala Glu 
580 585 590 



His Arg His Val Phe Gly Thr Tyr Pro Gly Lys Leu Lys Arg Arg Leu 
595 600 605 



Val Lys Glu Ala Tyr Val Asn Gly Gin Leu Pro Leu Leu Trp His Glu 
610 615 620 



Gly Lys Asn Arg Asp Val Pro Ala Ala Ala Ala Asp Asp Arg Gin Ala 
625 630 635 640 



Gin His Val Ala Asp Leu Cys Arg Lys Val Phe Leu Pro Val Leu Gly 
645 650 655 



Val Ala Pro Pro His Ala Gin Trp Pro Leu Cys Glu Leu Ala Leu Asp 
660 665 670 



Ser Leu Gin Cys Val Arg Leu Ala Gly Ala lie Glu Glu Cys Tyr Gly 
675 680 685 



Val Pro Phe Glu Pro Thr Leu Leu Phe Lys Leu Glu Thr Val Gly Ala 
690 695 700 



lie Ala Glu Tyr Val Leu Ala His Gly Arg Gin Ala Pro Thr Pro Thr 
705 710 715 720 



Arg Ala Pro Val Ala Ser Thr Thr Cys Ser Glu Glu Pro lie Ala lie 
725 730 735 



Val Ala Met His Cys Glu Val Pro Gly Ala Gly Glu Asn Thr Glu Ala 
740 745 750 



Leu Trp Ser Phe Leu Arg Ser Asp Val Asn Ala lie Arg Pro lie Glu 
755 760 765 



Ser Thr Arg Pro Asp Leu Trp Ala Ala Met Arg Ala Tyr Pro Gly Leu 
770 775 780 
Ala Gly Glu Gln Leu Pro Arg Tyr Ala Gly Phe Leu Asp Asp Val Asp
785 790 795 800 



Ala Phe Asp Ala Ala Phe Phe Gly He Ser Arg Arg Glu Ala Glu Cys 
805 810 815 



Met Asp Pro Gin Gin Arg Lys Val Leu Glu Met Val Trp Lys Leu He 
820 825 830 



Glu Gin Ala Gly His Asp Pro Leu Ser Trp Gly Gly Gin Pro Val Gly 
835 840 845 



Leu Phe Val Gly Ala His Thr Ser Asp Tyr Gly Glu Leu Leu Ala Ser 
850 855 860 



Gin Pro Gin Leu Met Ala Gin Cys Gly Ala Tyr He Asp Ser Gly Ser 

865 870 875 880 



His Leu Thr Met He Pro Asn Arg Ala Ser Arg Trp Phe Asn Phe Thr 
885 890 895 



Gly Pro Ser Glu Val He Asn Ser Ala Cys Ser Ser Ser Leu Val Ala 
900 905 910 



Leu His Arg Ala Val Gin Ser Leu Arg Gin Gly Glu Ser Ser Val Ala 
915 920 925 



Leu Val Leu Gly Val Asn Leu He Leu Ala Pro Lys Val Leu Leu Ala 
930 935 940 



Ser Ala Ser Ala Gly Met Leu Ser Pro Asp Gly Arg Cys Lys Thr Leu 

945 950 955 960 

Asp Ala Ala Ala Asp Gly Phe Val Arg Ser Glu Gly He Ala Gly Val 

965 970 975 



He Leu Lys Pro Leu Ala Gin Ala Leu Ala Asp Gly Asp Arg Val Tyr 
980 985 990 



Gly Leu Val Arg Gly Val Ala Val Asn His Gly Gly Arg Ser Asn Ser 
995 1000 1005 



Leu Arg Ala Pro Asn Val Asn Ala Gin Arg Gin Leu Leu He Arg 
1010 1015 1020 



Thr Tyr Gin Glu Ala Gly Val Glu Pro Ala Ser Val Gly Tyr Val 
1025 1030 1035 



Glu Leu His Gly Thr Gly Thr Ser Leu Gly Asp Pro Ile Glu Ile 
1040 1045 1050 

Gin Ala Leu Lys Glu Ala Phe lie Ala Leu Gly Ala Gin Ala Ala 
1055 1060 1065 

Pro Ser Asn Cys Gly He Gly Ser Val Lys Ser Ala Leu Gly His 
1070 1075 1080 

Leu Glu Ala Ala Ala Gly Leu Thr Gly Leu He Lys Val Leu Leu 
1085 1090 1095 

Met Leu Lys His Gly Glu Gln Ala Gly Thr Arg His Phe Ser Thr 
1100 1105 1110 

Leu Asn Pro Leu Ile Asp Leu Arg Gly Thr Ser Phe Glu Val Val 
1115 1120 1125 

Ala Gln His Arg Ala Trp Pro Ser Gln Val Gly Ile His Gly Thr 
1130 1135 1140 

Leu Leu Pro Arg Arg Ala Gly Ile Ser Ser Phe Gly Phe Gly Gly 
1145 1150 1155 

Ala Asn Ala His Ala Ile Val Glu Glu His Val Ile Ala Thr Pro 
1160 1165 1170 



Pro Ser Thr Ser Ser Ala Gly Gly Pro Val Gly Ile Val Leu Ser 
1175 1180 1185 



Ala Gly Ser Glu Ala Val Leu Arg Gin Gin Val Leu Ala Leu Ser 
1190 1195 1200 

Ala Trp Leu Arg Gin Gin Ser Pro Thr Pro Ala Gin Met He Asp 
1205 1210 1215 



Val Ala Tyr Thr Leu Gin Val Gly Arg Ala Ala Leu Ser His Arg 
1220 1225 1230 



Leu Ala Phe Ser Ala Thr Asp Ala Glu Gin Ala Leu Ala Arg Leu 
1235 1240 1245 

Glu Gly Arg Leu Ala Gly Val Met Asp Ala Glu Val His His Gly 
1250 1255 1260 

Val Val Asp Ala Ala Ala Thr Ala Pro Glu His Gly Arg Gin Thr 
1265 1270 1275 



Arg Glu Gly Leu Ala Gly Leu Leu Arg Ala Trp Thr Gln Gly Val 
1280 1285 1290 

Arg Val Asp Trp Ser Ala Leu Tyr Gly lie Gin Arg Pro Gin Arg 
1295 1300 1305 

Val Ser Leu Pro Val Tyr Pro Phe Ala Arg Glu Arg Tyr Trp Leu 
1310 1315 1320 



Pro Gly Gin Ala Met His Ala Ala Ala Asp Ala His Pro Met Leu 
1325 1330 1335 



Gin Leu Leu His Ala Asn Ala Lys Leu His Arg Tyr Ala Leu Arg 
1340 1345 1350 

Arg Ser Gly Cys Ala Ser Phe Leu Val Asp His Cys Val Asp Gly 
1355 1360 1365 



Arg Gin Val Leu Pro Ala Ala Val Gin Leu Glu Leu Val Arg Ala 
1370 1375 1380 



Val Ala Gin Arg Val Met Ala Gin Asp Glu Gly Cys lie Glu Leu 
1385 1390 1395 



Ala Gin Val Ala Phe Leu His Pro Leu Met Met Glu Glu Thr Glu 

1400 1405 1410 



Leu Glu Val Glu He Glu Leu Ser Lys Ser Asp Gin Asp Glu Phe 
1415 1420 1425 



Asp Phe Gin Leu His Asp Ala His Arg Gin Gin Val Phe Ser Gin 
1430 1435 1440 



Gly His Val Arg Arg Arg Val Tyr Thr Ala Thr Pro Arg Leu Asp 
1445 1450 1455 



Leu Ala Gin Leu Gin Lys Leu Cys Ala Glu Arg Val Leu Ser Gly 
1460 1465 1470 



Glu Asp Cys Tyr Ala His Phe Thr Ala Cys Gly Leu Gin Leu Gly 
1475 1480 1485 



Asp Arg Leu Lys Ser Val Gin Ser He Gly Cys Gly Arg Asn Gly 
1490 1495 1500 



Glu Gly Glu Pro Ile Ala Leu Gly Val Leu Arg Leu Pro Pro Ser 
1505 1510 1515 



Ser Val Glu Asp Ser His Val Leu Pro Pro Ser Leu Leu Asp Gly 
1520 1525 1530 



Ala Leu Gin Cys Ser Leu Gly Leu Gin Arg Asp Val Glu His He 
1535 1540 1545 



Ala Met Pro Tyr Thr Leu Glu Arg Met Thr Val His Ala Pro He 
1550 1555 1560 



Pro Pro Glu Ala Trp Val Leu Leu Arg His Gly His Ala Ala Arg 
1565 1570 1575 



Gin Ser Leu Asp He Asp Leu Leu Asp Ser Glu Gly Arg Val Cys 
1580 1585 1590 



Val Ser Leu Gly Asn Tyr Thr Gly Arg Ala Pro Lys Ala Val Ser 
1595 1600 1605 



Ala Val Arg Ala Leu Val Leu Ala Pro Val Trp Gin Ala Leu Thr 
1610 1615 1620 



Glu Thr Ala Pro Ala Trp Pro Asp Pro Ala Glu Arg He Val Thr 
1625 1630 1635 



Val Cly Asp Asp Trp Arg Ser Hie tfhc Gly Lhe Asp Glv. Pr 
1640 1645 1650 

Ala Leu Ser Leu Glu Asp Ser Val Glu Val He Ala Thr Arg Leu 
1655 1660 1665 



Gly Gin Ser Gly Lys Phe Asp His Leu Val Trp He Val Pro He 
1670 1675 1680 



Ala Glu Ser Glu Thr Asp He Ala Ala Gin Gly Ser Ala Ala He 
1685 1690 1695 



Ala Gly Phe Arg Leu Val Lys Ala Leu Leu Ala Leu Gly Tyr Ala 
1700 1705 1710 



His Arg Pro Leu Gly Leu Thr Val Leu Thr Arg Gin Ala Leu Thr 
1715 1720 1725 



Arg Gin Pro Ser His Ala Ala Val His Gly Leu He Gly Thr Leu 
1730 1735 1740 



Ala Lys Glu Tyr Cys Asn Trp Lys Ile Arg Leu Leu Asp Leu Pro 
1745 1750 1755 



Ser Val Lys Ser Trp Pro Gin Trp Glu Gin Leu Arg Ser Leu Pro 
1760 1765 1770 



Trp His Ala Gin Gly Glu Ala Leu He Gly Arg Gly Thr Cys Trp 
1775 1780 1785 



Tyr Arg Arg Gin Leu Cys Glu Val Leu Pro Leu Pro Ser Leu Glu 
1790 1795 1800 



Pro Pro Pro Tyr Arg Val Gly Gly Val Tyr Val Val He Gly Gly 
1805 1810 1815 



Ala Gly Gly Leu Gly Glu Val Leu Ser Glu His Leu He Arg Thr 
1820 1825 1830 



Tyr Asp Ala Gin Leu He Trp He Gly Arg Arg Val Leu Asp Glu 
1835 1840 1845 



Gly Ile Ala Arg Lys Gln Thr Arg Leu Ala Ser Leu Gly Arg Ala 
1850 1855 1860 



Pro His Tyr Ile Ser Ala Asp Ala Ser Asp Pro Ala Ala Leu Gln 
1865 1870 1875 



La Ala His Ac:: -Glu lie Val - Ms' Leu Hie Gly Gin »rc ?*i£? Oly 
1880 1885 1890 



Leu He Leu Ser Asn He Val Leu Lys Asp Ala Ser Leu Ala Arg 
1895 1900 1905 



Met Glu Glu Ala Asp Phe Arg Asp Val Leu Ala Ala Lys Leu Asp 
1910 1915 1920 



Val Ser Val Cys Ala Ala Gln Val Phe Gly Thr Ala Pro Leu Asp 
1925 1930 1935 



Phe Val Leu Phe Phe Ser Ser He Gin Ser Thr Thr Lys Ala Ala 
1940 1945 1950 



Gly Gin Gly Asn Tyr Ala Ala Gly Cys Cys Tyr Val Asp Ala Phe 
1955 1960 1965 



Gly Glu Leu Trp Ala Arg Arg Gly Leu Arg Val Lys Thr He Asn 
1970 1975 1980 



Trp Gly Tyr Trp Gly Ser Val Gly Val Val Ala Gly Glu Asp Tyr 
1985 1990 1995 

Arg Arg Arg Met Ala Gin Lys His Met Ala Ser lie Glu Gly Ala 
2000 2005 2010 

Glu Ala Met Gin Val Leu Ser Gin Leu Leu Cys Ala Pro Leu Gin 
2015 2020 2025 

Arg Leu Ala Tyr Val Lys He Asp Asp Ala Asn Ala Met Arg Ala 
2030 2035 2040 

Leu Gly Val Val Glu Asp Glu Ser Val Gin He Pro Val His Ala 
2045 2050 2055 



Pro Ala Glu Pro Pro Arg Gly Gin Pro Gly Pro Val Val Glu Leu 
2060 2065 2070 



Ser Val Asn Leu Asp Ala Arg Arg Glu Arg Glu Thr Leu Leu Ala 
2075 2080 2085 



Ala Trp Leu Leu Glu Leu He Glu Gin Leu Gly Gly Phe Pro Pro 
2090 2095 2100 



Ala Ser Phe Asp He Ala Thr Leu Ala Gin Arg Leu His He Val 
2105 2110 2115 

Pro Ala Tyr Arg Ser Trp Leu Glu His Ser Val Arg Met Leu Gly 
2120 2125 2130 



Val Tyr Gly Tyr Leu Arg Ala Thr Gly Glu Ser Arg Phe Glu Leu 
2135 2140 2145 



Ala Asp Lys Pro Pro Asp Asp Ala Arg Gly Ala Trp Asn Ala His 
2150 2155 2160 



Val His Glu Ala Ser Val Glu Ala Gly Glu Glu Ala Gin Arg Arg 
2165 2170 2175 



Leu Leu Asp Arg Cys Met Arg Ala Leu Pro Ala Val Leu Arg Gly 
2180 2185 2190 



Glu Arg Lys Ala Thr Glu Leu Leu Phe Pro Glu Gly Ser Met Ala 
2195 2200 2205 



Trp Val Glu Gly lie Tyr Gin Asn Asn Pro Leu Ala Asp Tyr Phe 
2210 2215 2220 



Asn Ala Gln Leu Val Thr Arg Leu Ile Ala Tyr Leu Arg Arg Arg 
2225 2230 2235 



Leu Glu Ser Thr Pro Thr Ala Arg Leu Lys Leu Cys Glu He Gly 
2240 2245 2250 

Ala Gly Ser Gly Gly Thr Thr Ala Ser Val Leu Gin Gin Leu Gin 
2255 2260 2265 



Ala Tyr Gly Glu His He Glu Glu Tyr Leu Tyr Thr Asp Leu Ser 
2270 2275 2280 

Pro Val Phe Leu His His Ala Glu Lys His Tyr Gin Pro Arg Ala 
2285 2290 2295 



Pro Tyr Leu Arg Thr Ala Cys Phe Asp Val Ala Arg Ala Pro Thr 
2300 2305 2310 



Ala Gin Ala Leu Glu Ser Gly Gly Tyr Asp Val Val He Ala Ala 
2315 2320 2325 



Asn Val Leu His Ala Thr Arg Asp He Ala Lys Thr Leu Arg Asn 
2330 2335 2340 

Ala Lys Ala Leu Leu Lys Pro Gly Gly Leu Leu Leu Leu Asn Glu 
2345 2350 2355 



Val He Glu Arg Ser Leu Val Leu His Leu Thr Phe Gly Leu Leu 
2360 2365 2370 



Glu Ser Trp Trp Leu Pro Gin Asp Lys He Leu Arg Leu Ala Gly 
2375 2380 2385 

Ser Pro Leu Leu Ala Cys Ala Thr Trp Arg Ser Leu Leu Glu Ala 
2390 2395 2400 



Glu Gly Phe Ala Gly Leu Ser Val His Arg Ala Gin Pro Asp Ala 
2405 2410 2415 



Gly Gin Ala He He Cys Ala Tyr Ser Asp Gly He Val Arg Gin 
2420 2425 2430 



Ala Ser Thr He Glu Val Ala Arg Asn Glu Lys Val Thr Val Pro 
2435 2440 2445 

Ser Gin Pro Ala Glu Ala Gly Glu Ser Pro Leu Asp Leu Val Lys 
2450 2455 2460 



Lys Leu Leu Gly Arg Ile Leu Lys Met Asp Pro Ala Thr Leu Asp 
2465 2470 2475 



Thr Ser His Pro Leu Glu Tyr Tyr Gly Val Asp Ser lie Val Ala 
2480 2485 2490 



lie Glu Leu Ala Met Ala Leu Arg Glu Thr Phe Pro Gly Phe Glu 
2495 2500 2505 



Val Ser Glu Leu Phe Glu Thr Gin Ser lie Asp Thr Leu Leu Gly 
2510 2515 2520 



Ser Leu Glu Gin Ala Pro Leu Leu Ala Thr Leu Thr Ala Pro Pro 
2525 2530 2535 



Gin Gin Asp Met Leu Gin Gin Leu Lys Gin Leu Leu Ala Arg Thr 
2540 2545 2550 



Leu Lys Leu Asp lie Thr Gin lie Asp Thr Ser Lys Thr Leu Glu 
2555 2560 2565 



Ser Tyr Gly Val Asp Ser lie Val lie lie Glu Leu Ala Asn Ala 
2570 2575 2580 



Leu Arg Glu Arg Tyr Pro Ser Leu Asp Ala Ser Gin Leu Met Glu 
2585 2590 2595 



Thr Leu Ser lie Asp Arg Leu Val Ala Gin Trp Gin Ala Thr Glu 
2600 2605 2610 



Pro Ala Val Pro Ala Glu Pro Thr Ala Glu Pro Pro Val Ala Asp 
2615 2620 2625 



Glu Asp Ala Ala Ala He He Gly Leu Ala Gly Arg Phe Pro Gly 
2630 2635 2640 



Ala Asp Thr Leu Glu Glu Phe Trp Asn Asn Leu Arg Asn Gly Gin 
2645 2650 2655 



Ser Ser Met Gly Glu Val Pro Gly Glu Arg Trp Asp His Gin His 
2660 2665 2670 



Tyr Phe Asp Ser Glu Arg Gin Ala Pro Gly Lys Thr Tyr Ser Arg 
2675 2680 2685 



Trp Gly Ala Phe Leu Arg Asp He Asp Gly Phe Asp Ala Ala Phe 
2690 2695 2700 



Phe Glu Trp Pro Asp Ser Val Ala Leu Glu Ser Asp Pro Gln Ala 
2705 2710 2715 



Arg lie Phe Leu Glu Gin Ala Tyr Ala Gly lie Glu Asp Ala Gly 
2720 2725 2730 



Tyr Thr Pro Gly Ser Leu Ser Lys Ser Gin Arg Val Gly Val Phe 
2735 2740 2745 



Val Gly Val Met Asn Gly Tyr Tyr Ser Gly Gly Ala Arg Phe Trp 
2750 2755 2760 



Gin He Ala Asn Arg Val Ser Tyr Gin Phe Asp Phe Arg Gly Pro 
2765 2770 2775 



Ser Leu Ala Val Asp Thr Ala Cys Ser Ala Ser Leu Thr Ala He 
2780 2785 2790 



His Leu Ala Leu Glu Ser Leu Arg Ser Gly Ser Cys Glu Val Ala 
2795 2800 2805 



Leu Ala Gly Gly Val Asn Leu Leu Val Asp Pro Gin Gin Tyr Leu 
2810 2815 2820 



Asn Leu Ala Gly Ala Ala Met Leu Ser Ala Gly Ala Ser Cys Arg 

2825 2830 2835 



Pro Phe Gly Glu Ala Ala Asp Gly Phe Val Ala Gly Glu Ala Cys 
2840 2845 2850 



Gly Val Val Leu Leu Lys Pro Leu Lys Gin Ala Arg Ala Asp Gly 
2855 2860 2865 



Asp Val He His Ala Val He Arg Gly Ser Met He Asn Ala Gly 
2870 2875 2880 



Gly His Thr Ser Ala Phe Ser Ser Pro Asn Pro Ala Ala Gin Ala 
2885 2890 2895 



Glu Val Val Arg Gin Ala Leu Gin Arg Ala Gly Val Ala Pro Asp 
2900 2905 2910 



Ser He Ser Tyr He Glu Ala His Gly Thr Gly Thr Val Leu Gly 
2915 2920 2925 



Asp Ala Val Glu Leu Gly Ala Leu Asn Lys Val Phe Asp Lys Arg 
2930 2935 2940 



Ala Ala Pro Cys Pro He Gly Ser Leu Lys Ala Asn He Gly His 
2945 2950 2955 



Ala Glu Ser Ala Ala Gly Ile Ala Gly Leu Ala Lys Leu Val Leu 
2960 2965 2970 



Gin Phe Arg His Gly Glu Leu Val Pro Ser Leu Asn Ala Phe Pro 
2975 2980 2985 



Leu Asn Pro Tyr He Glu Phe Gly Arg Phe Gin Val Gin Gin Gin 
2990 2995 3000 



Pro Ala Pro Trp Pro Arg Arg Gly Ala Gin Pro Arg Arg Ala Gly 
3005 3010 3015 



Leu Ser Ala Phe Gly Ala Gly Gly Ser Asn Ala His Leu Val Val 
3020 3025 3030 

Glu Glu Ala Pro Ala Met Ala Pro Gly Val Ser He Ser Ala Ser 
3035 3040 3045 



Ser Pro Ala Leu He Val Leu Ser Ala Arg Thr Leu Pro Ala Leu 
3050 3055 3060 



Gln Gln Arg Ala Arg Asp Leu Leu Val Trp Met Gln Ala Arg Gln 
3065 3070 3075 



Val Asp Asp Val Met Leu Ala Asp Val Ala Tyr Thr Leu His Leu 
3080 3085 3090 



Gly Arg Val Ala Met Glu Gin Arg Leu Ala Phe Thr Ala Gly Ser 
3095 3100 3105 



Ala Ala Glu Leu Ser Glu Lys Leu Gin Ala Tyr Leu Gly His Ala 
3110 3115 3120 



He Arg Ala Asp He Tyr Leu Ser Glu Asp Thr Pro Gly Lys Pro 
3125 3130 3135 



Ala Gly Ala Pro He Val Ala Glu Glu Asp Leu Leu Thr Leu Met 
3140 3145 3150 



Asp Ala Trp He Glu Lys Gly Gin Tyr Gly Arg Leu Leu Glu Tyr 
3155 3160 3165 



Trp Thr Lys Gly Gln Pro Ile Asp Trp Asn Lys Leu Tyr Trp Arg 
3170 3175 3180 



Lys Leu Tyr Ala Asp Gly Arg Pro Arg Arg Ile Ser Leu Pro Thr 
3185 3190 3195 



Tyr Pro Phe Glu His Arg Arg Tyr Trp Gin Thr Pro Val Pro Gly 
3200 3205 3210 



Glu Arg Ser Leu His Ala Thr Ala Pro Ala Thr Arg Glu Thr Val 
3215 3220 3225 



Ala Val Gly Ala Met Pro Asp Pro Ala Gly Ala Thr Val Gin Ala 
3230 3235 3240 



Arg Leu Cys Ala Leu Cys Gin Val Leu Leu Gly Lys Pro Val Thr 
3245 3250 3255 


Ala Gin Met Asp Phe Phe Ala Val Gly Gly His Ser Val Leu Ala 
3260 3265 3270 



He Gin Leu Val Ser Arg lie Arg Lys Ser Phe Gly Val Glu Tyr 
3275 3280 3285 



Pro Val Ser Ala Leu Phe Glu Ser Ala Leu Leu Ser Asp Met Ala 
3290 3295 3300 



Arg Gln Ile Glu Gln Leu Arg Val Asn Gly Val Ala Lys Arg Met 
3305 3310 3315 



Pro Ala Leu Leu Pro Ala Gly Arg Val Gly Ala He Pro Ala Thr 
3320 3325 3330 



Tyr Ala Gin Glu Arg Leu Trp Leu Val His Glu His Met Ser Glu 
3335 3340 3345 



Gin Arg Ser Ser Tyr Asn He Thr Phe Ala Met His Phe Arg Gly 
3350 3355 3360 



Val Asp Phe Arg Ala Glu Ala Met Arg Ala Ala Leu Asn Ala Leu 
3365 3370 3375 



Val Val Arg His Glu Val Leu Arg Thr Arg Phe Leu Ser Glu Asp 
3380 3385 3390 



Gly Gin Leu Gin Gin Val He Ala Ala Ser Leu Thr Leu Glu Val 
3395 3400 3405 



Pro Val Arg Glu Met Ser Val Glu Glu Val Asp Leu Leu Leu Ala 
3410 3415 3420 



Ala Ser Thr Arg Glu Thr Phe Asp Leu Arg Gin Gly Pro Leu Phe 
3425 3430 3435 



Lys Ala Arg He Leu Arg Val Ala Ala Asp His His Val Val Leu 
3440 3445 3450 



Ser Ser He His His He He Ser Asp Gly Trp Ser Leu Gly Val 
3455 3460 3465 



Phe Asn Arg Asp Leu His Gin Leu Tyr Glu Ala Cys Leu Arg Gly 
3470 3475 3480 



Thr Pro Pro Thr Leu Pro Thr Leu Ala Val Gin Tyr Ala Asp Tyr 
3485 3490 3495 



Ala Leu Trp Gin Arg Gin Trp Glu Leu Ala Ala Pro Leu Ser Tyr 
3500 3505 3510 



Trp Thr Arg Ala Leu Glu Gly Tyr Asp Asp Gly Leu Asp Leu Pro 
3515 3520 3525 



Tyr Asp Arg Pro Arg Gly Ala Thr Arg Ala Trp Arg Ala Gly Leu 
3530 3535 3540 



Val Lys His Arg Tyr Pro Pro Gin Leu Ala Gin Gin Leu Ala Ala 
3545 3550 3555 



Tyr Ser Gin Gin Tyr Gin Ala Thr Leu Phe Met Ser Leu Leu Ala 
3560 3565 3570 



Gly Leu Ala Leu Val Leu Gly Arg Tyr Ala Asp Arg Lys Asp Val 
3575 3580 3585 



Cys He Gly Ala Thr Val Ser Gly Arg Asp Gin Leu Glu Leu Glu 
3590 3595 3600 



Glu Leu He Gly Phe Phe He Asn He Leu Pro Leu Arg Val Asp 
3605 3610 3615 



Leu Ser Gly Asp Pro Cys Leu Glu Glu Val Leu Leu Arg Thr Arg 
3620 3625 3630 



Gin Val Val Leu Asp Gly Phe Ala His Gin Ser Val Pro Phe Glu 
3635 3640 3645 



His Val Leu Gln Ala Leu Arg Arg Gln Arg Asp Ser Ser Gln Ile 
3650 3655 3660 



Pro Leu Val Pro Val Met Leu Arg His Gin Asn Phe Pro Thr Gin 
3665 3670 3675 

Glu He Gly Asp Trp Pro Glu Gly Val Arg Leu Thr Gin Met Glu 
3680 3685 3690 



Leu Gly Leu Asp Arg Ser Thr Pro Ser Glu Leu Asp Trp Gin Phe 
3695 3700 3705 



Tyr Gly Asp Gly Ser Ser Leu Glu Leu Thr Leu Glu Tyr Ala Gin 
3710 3715 3720 

Asp Leu Phe Asp Glu Ala Thr Val Arg Arg Met He Ala His His 
3725 3730 3735 



Gin Gin Ala Leu Glu Ala Met Val Ser Arg Pro Gin Leu Arg Val 
3740 3745 3750 



Gly Lys Trp Asp Met Leu Thr Ala Glu Glu Arg Arg Leu Phe Ala 
3755 3760 3765 



Ala Leu Asn Ala Thr Gly Thr Pro Arg Glu Trp Pro Ser Leu Ala 
3770 3775 3780 



Gin Gin Phe Glu Arg Gin Ala Gin Ala Thr Pro Gin Ala He Ala 
3785 3790 3795 



Cys Val Ser Asp Gly Gin Ser Trp Ser Tyr Ala Gin Leu Glu Ala 
3800 3805 3810 



Arg Ala Asn Gin Leu Ala Gin Ala Leu Arg Gly Gin Gly Ala Gly 
3815 3820 3825 



Arg Asp Val Arg Val Ala Val Gin Ser Ala Arg Thr Pro Glu Leu 
3830 3835 3840 



Leu Met Ala Leu Leu Ala He Phe Lys Ala Gly Ala Cys Tyr Val 
3845 3850 3855 



Pro He Asp Pro Ala Tyr Pro Ala Ala Tyr Arg Glu Gin He Leu 
3860 3865 3870 

Ala Glu Val Gin Val Ser He Val Leu Glu Gin Asp Glu Leu Ala 
3875 3880 3885 



Leu Asp Glu Gln Gly Gln Phe His Asn Pro Arg Trp Arg Glu Gln 
3890 3895 3900 

Ala Pro Thr Pro Leu Gly Leu Arg Glu His Pro Gly Asp Leu Ala 
3905 3910 3915 



Cys Val Met Val Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val 
3920 3925 3930 



Met Val Pro Tyr Ala Gin Leu His Asn Trp Leu His Ala Gly Trp 
3935 3940 3945 



Gin Arg Ser Ala Phe Glu Ala Gly Glu Arg Val Leu Gin Lys Thr 
3950 3955 3960 



Ser lie Ala Phe Ala Val Ser Val Lys Glu Leu Leu Ser Gly Leu 
3965 3970 3975 



Leu Ala Gly Val Glu Gin Val Met Leu Pro Asp Glu Gin Val Lys 
3980 3985 3990 



Asp Ser Leu Ala Leu Ala Arg Ala lie Glu Gin Trp Gin Val Thr 
3995 4000 4005 



Arg Leu Tyr Leu Val Pro Ser His Leu Gin Ala Leu Leu Asp Ala 
4010 4015 4020 

Thr Gin Gly Arg Asp Gly Leu Leu His Ser Leu Arg His Val Val 
4025 4030 4035 



Thr Ala Gly Glu Ala Leu Pro Ser Ala Val Arg Glu Thr Val Arg 
4040 4045 4050 



Val Arg Leu Pro Gin Val Gin Leu Trp Asn Asn Tyr Gly Cys Thr 
4055 4060 4065 



Glu Leu Asn Asp Ala Thr Tyr His Arg Ser Asp Thr Val Ala Pro 
4070 4075 4080 



Gly Thr Phe Val Pro He Gly Ala Pro He Ala Asn Thr Glu Val 
4085 4090 4095 



Tyr Val Leu Asp Arg Gin Leu Arg Gin Val Pro He Gly Val Met 
4100 4105 4110 

Gly Glu Leu His Val His Ser Val Gly Met Ala Arg Gly Tyr Trp 
4115 4120 4125 



Asn Arg Pro Gly Leu Thr Ala Ser Arg Phe Ile Ala His Pro Tyr 
4130 4135 4140 



Ser Glu Glu Pro Gly Thr Arg Leu Tyr Lys Thr Gly Asp Met Val 
4145 4150 4155 



Arg Arg Leu Ala Asp Gly Thr Leu Glu Tyr Leu Gly Arg Gin Asp 
4160 4165 4170 



Phe Glu Val Lys Val Arg Gly His Arg Val Asp Thr Arg Gin Val 
4175 4180 4185 



Glu Ala Ala Leu Arg Ala Gin Pro Ala Val Ala Glu Ala Val Val 
4190 4195 4200 



Ser Gly His Arg Val Asp Gly Asp Met Gin Leu Val Ala Tyr Val 
4205 4210 4215 



Val Ala Arg Glu Gly Gin Ala Pro Ser Ala Gly Glu Leu Lys Gin 
4220 4225 4230 



Gin Leu Ser Ala Gin Leu Pro Thr Tyr Met Leu Pro Thr Val Tyr 
4235 4240 4245 



Gin Trp Leu Glu Gin Leu Pro Arg Leu Ser Asn Gly Lys Leu Asp 

4250 4255 4260 



Arg Leu Ala Leu Pro Ala Pro Gin Val Val His Ala Gin Glu Tyr 
4265 4270 4275 



Val Ala Pro Arg Asn Glu Ala Glu Gin Arg Leu Ala Ala Leu Phe 
4280 4285 4290 



Ala Glu Val Leu Arg Val Glu Gin Val Gly He His Asp Asn Phe 
4295 4300 4305 



Phe Ala Leu Gly Gly His Ser Leu Ser Ala Ser Gln Leu Ile Ser 
4310 4315 4320 



Arg He Arg Gin Ser Phe His Val Asp Leu Pro Leu Ser Arg He 
4325 4330 4335 



Phe Glu Ala Pro Thr He Glu Gly Leu Val Arg Gin Leu Ala Leu 
4340 4345 4350 



Pro Ser Glu Gly Gly Val Ala Ser Ile Ala Arg Val Ala Arg Asn 
4355 4360 4365 



Arg Thr He Pro Leu Ser Leu Phe Gin Glu Arg Leu Trp Phe Val 
4370 4375 4380 



His Gin His Met Pro Glu Gin Arg Thr Ser Tyr Asn Gly Thr Leu 
4385 4390 4395 



Ala Leu Arg Leu Arg Gly Pro Leu Ser Val Glu Ala Met Arg Ala 
4400 4405 4410 

Ala Leu Arg Ala Leu Val Leu Arg His Glu He Leu Arg Thr Arg 
4415 4420 4425 



Phe Val Leu Pro Thr Gly Ala Ser Glu Pro Val Gin Val He Asp 
4430 4435 4440 



Glu His Ser Asp Phe Gin Leu Ser Val Gin Leu Val Glu Asp Thr 
4445 4450 4455 



Glu He Ala Ser Leu Met Asp Glu Leu Ala Ser His He Tyr Asp 
4460 4465 4470 



Leu Ala Asn Gly Pro Leu Phe He Ala Cys Leu Leu Gin Leu Asp 
4475 4480 4485 



Glu Gln Glu His Val Leu Leu Ile Gly Met His His Leu Ile Tyr 
4490 4495 4500 



Asp Ala Trp Ser Gin Phe Thr Val Met Asn Arg Asp Leu Arg Val 
4505 4510 4515 



Leu Tyr His Arg His Leu Gly Leu Ala Gly Gly Asp Leu Pro Glu 
4520 4525 4530 



Leu Pro He Gin Tyr Ala Asp Tyr Ala He Trp Gin Arg Ala Gin 
4535 4540 4545 



Asn Leu Asp Ala Gin Leu Ala Tyr Trp Gin Ala Met Leu His Asp 
4550 4555 4560 



Tyr Asp Asp Gly Leu Glu Leu Pro Tyr Asp Tyr Pro Arg Pro Arg 
4565 4570 4575 



Asn Arg Thr Trp His Ala Ala Val Tyr Thr His Thr Tyr Pro Ala 
4580 4585 4590 



Glu Leu Val Gln Arg Phe Ala Gly Phe Val Gln Ala His Gln Ser 
4595 4600 4605 



Thr Leu Phe He Gly Leu Leu Ala Ser Phe Ala Val Val Leu Asn 
4610 4615 4620 



Lys Tyr Thr Gly Arg Asp Asp Leu Cys He Gly Thr Thr Thr Ala 
4625 4630 4635 



Gly Arg Thr His Leu Glu Leu Glu Asn Leu He Gly Phe Phe He 
4640 4645 4650 



Asn He Leu Pro Leu Arg Leu Arg Leu Asp Gly Asp Pro Asp Val 
4655 4660 4665 



Ala Glu He Met Arg Arg Thr Arg Leu Val Ala Met Ser Ala Phe 
4670 4675 4680 



Glu Asn Gin Ala Leu Pro Phe Glu His Leu Leu Asn Ala Leu His 
4685 4690 4695 



Lys Gin Arg Asp Thr Ser Arg He Pro Leu Val Pro Val Val Met 
4700 4705 4710 



Arg His Gin Asn Phe Pro Asp Thr He Gly Asp Trp Ser Asp Gly 
4715 4720 4725 



Ile Arg Thr Glu Val Ile Gln Arg Asp Leu Arg Ala Thr Pro At: 
4730 4735 4740 



Glu Met Asp Leu Gin Phe Phe Gly Asp Gly Thr Gly Leu Ser Val 
4745 4750 4755 



Thr Val Glu Tyr Ala Ala Glu Leu Phe Ser Glu Ala Thr He Arg 
4760 4765 4770 



Arg Leu He His His His Gin Leu Val Leu Glu Gin Met Leu Ala 
4775 4780 4785 



Ala His Glu Ser Ala Thr Cys Pro Leu Asp Val Ala Asp 
4790 4795 4800 



<210> 5 

<211> 45 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(45) 

<223> Acyl-CoA ligase subdomain I 



<400> 5 

acc tct ggt tcc tcg ggt gag tcc aag ggc atc ctg ctt agc cac 45 
Thr Ser Gly Ser Ser Gly Glu Ser Lys Gly Ile Leu Leu Ser His 
1 5 10 15 



<210> 6 
<211> 15 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 6 

Thr Ser Gly Ser Ser Gly Glu Ser Lys Gly Ile Leu Leu Ser His 
1 5 10 15 



<210> 7 

<211> 24 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(24) 

<223> Acyl-CoA ligase subdomain 



<400> 7 

ggt tac ttt cgt acc ggc gac ctg 
Gly Tyr Phe Arg Thr Gly Asp Leu 
1 5 



<210> 8 

<211> 8 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 8 

Gly Tyr Phe Arg Thr Gly Asp Leu 
1 5 
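
Each CDS entry in this listing pairs a nucleotide sequence with the peptide it encodes. The following is a minimal sketch, not part of the original listing, of how such a pair could be cross-checked, assuming the standard genetic code applies. The function names `translate` and `to_one_letter` are hypothetical helpers introduced here for illustration only.

```python
# Standard genetic code in the conventional t, c, a, g ordering (assumed to apply).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(dna: str) -> str:
    """Translate a codon string (whitespace ignored) into one-letter residues."""
    dna = "".join(dna.lower().split())
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def to_one_letter(peptide: str) -> str:
    """Convert a space-separated three-letter peptide into one-letter codes."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split())


# Nucleotide sequence of SEQ ID NO: 7 and peptide of SEQ ID NO: 8, as listed above.
seq7 = "ggt tac ttt cgt acc ggc gac ctg"
seq8 = "Gly Tyr Phe Arg Thr Gly Asp Leu"
print(translate(seq7) == to_one_letter(seq8))
```

A failed lookup in either table would point to a character that is not a valid nucleotide or residue code, which is one way transcription errors in a listing of this kind could be detected.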



<210> 9 

<211> 51 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1) . . (51) 

<223> Beta-ketoacyl synthase 1 subdomain I 



<400> 9 

ggc ccc agc gaa gta atc aac agc gct tgc tcc agc tcg ctg gtg gcg 
Gly Pro Ser Glu Val Ile Asn Ser Ala Cys Ser Ser Ser Leu Val Ala 
1 5 10 15 

ctg 
Leu 



<210> 10 

<211> 17 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 10 



Gly Pro Ser Glu Val Ile Asn Ser Ala Cys Ser Ser Ser Leu Val Ala 
1 5 10 15 



Leu 



<210> 11 

<211> 30 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(30) 

<223> Beta-ketoacyl synthase 1 subdomain II 



<400> 11 

gtt gaa cta cac ggc act ggt acc agc ctg 
Val Glu Leu His Gly Thr Gly Thr Ser Leu 
1 5 10 



<210> 12 
<211> 10 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 12 

Val Glu Leu His Gly Thr Gly Thr Ser Leu 
1 5 10 



<210> 13 

<211> 30 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(30) 

<223> Beta-ketoacyl synthase 1 subdomain III 



<400> 13 

gcg ctg ggc cat cta gaa gcc gct gca ggc 
Ala Leu Gly His Leu Glu Ala Ala Ala Gly 
1 5 10 



<210> 14 

<211> 10 

<212> PRT 

<213> Xanthomonas albilineans 



<400> 14 

Ala Leu Gly His Leu Glu Ala Ala Ala Gly 
1 5 10 



<210> 15 

<211> 51 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(51) 

<223> Beta-ketoacyl synthase 2 subdomain I 



<400> 15 

ggg cca agc ctg gcg gtg gat acc gcc tgt tcg gct tcg ctc acc gcg 48 
Gly Pro Ser Leu Ala Val Asp Thr Ala Cys Ser Ala Ser Leu Thr Ala 
1 5 10 15 

atc 51 
Ile 



<210> 16 

<211> 17 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 16 

Gly Pro Ser Leu Ala Val Asp Thr Ala Cys Ser Ala Ser Leu Thr Ala 
1 5 10 15 



Ile 



<210> 17 

<211> 30 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(30) 

<223> Beta-ketoacyl synthase 2 subdomain II 



<400> 17 

atc gag gcg cat ggc acc ggc acc gta cta 30 
Ile Glu Ala His Gly Thr Gly Thr Val Leu 
1 5 10 



WO 02/024736 



PCT/AU01/01190 



<210> 18 
<211> 10 
<212> PRT 

<213> Xanthomonas albilineans 



<400> 18 



Ile Glu Ala His Gly Thr Gly Thr Val Leu 
1 5 10 



<210> 19 

<211> 30 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(30) 

<223> Beta-ketoacyl synthase 



<400> 19 

cgt tct cct cgc cta acc ctg ccg ccc agg 30 
Arg Ser Pro Arg Leu Thr Leu Pro Pro Arg 
1 5 10 



<210> 20 
<211> 10 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 20 

Arg Ser Pro Arg Leu Thr Leu Pro Pro Arg 
1 5 10 



<210> 21 

<211> 93 

<212> DNA 

<213> Xanthomonas albilineans 



<220> 

<221> CDS 

<222> (1)..(93) 

<223> Beta-ketoacyl reductase domain 



<400> 21 

gtc tac gtc gtg atc ggc ggc gct ggc ggc ttg ggt gaa gta ttg agc 
Val Tyr Val Val Ile Gly Gly Ala Gly Gly Leu Gly Glu Val Leu Ser 
1 5 10 15 

gaa cac ttg atc cgc acg tac gac gcg cag ctg atc tgg atc ggg 
Glu His Leu Ile Arg Thr Tyr Asp Ala Gln Leu Ile Trp Ile Gly 
20 25 30 



<210> 22 
<211> 31 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 22 

Val Tyr Val Val Ile Gly Gly Ala Gly Gly Leu Gly Glu Val Leu Ser 
1 5 10 15 



Glu His Leu Ile Arg Thr Tyr Asp Ala Gln Leu Ile Trp Ile Gly 
20 25 30 



<210> 23 

<211> 36 

<212> DNA 

<213> Xanthomonas albilineans 



<220> 

<221> CDS 

<222> (1)..(36) 

<223> Acyl carrier protein 1 domain 



<400> 23 

tgc gaa ctg gcg ctg gat tcg ctc caa tgc gtg cgt 
Cys Glu Leu Ala Leu Asp Ser Leu Gln Cys Val Arg 
1 5 10 




<210> 24 
<211> 12 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 24 

Cys Glu Leu Ala Leu Asp Ser Leu Gln Cys Val Arg 
1 5 10 



<210> 25 

<211> 36 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(36) 

<223> Acyl carrier protein 2 



<400> 25 

gag tac tac ggt gtc gat tcg atc gtg gcg atc gaa 
Glu Tyr Tyr Gly Val Asp Ser Ile Val Ala Ile Glu 
1 5 10 



<210> 26 
<211> 12 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 26 

Glu Tyr Tyr Gly Val Asp Ser Ile Val Ala Ile Glu 
1 5 10 



<210> 27 

<211> 36 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(36) 

<223> Acyl carrier protein 3 domain 



<400> 27 

gag agc tat ggt gtc gac tcc atc gtc atc atc gaa 
Glu Ser Tyr Gly Val Asp Ser Ile Val Ile Ile Glu 
1 5 10 



<210> 28 
<211> 12 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 28 

Glu Ser Tyr Gly Val Asp Ser Ile Val Ile Ile Glu 
1 5 10 



<210> 29 

<211> 18 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(18) 

<223> Adenylation domain subdomain I 



<400> 29 

tgg agc tat gcg cag ttg 
Trp Ser Tyr Ala Gln Leu 
1 5 



<210> 30 

<211> 6 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 30 

Trp Ser Tyr Ala Gln Leu 
1 5 



<210> 31 

<211> 33 

<212> DNA 

<213> Xanthomonas albilineans 



<220> 

<221> CDS 

<222> (1) . . (33) 

<223> Adenylation domain subdomain II 



<400> 31 

ttc aag gcc ggt gca tgc tat gtg ccg atc gat 
Phe Lys Ala Gly Ala Cys Tyr Val Pro Ile Asp 
1 5 10 



<210> 32 
<211> 11 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 32 

Phe Lys Ala Gly Ala Cys Tyr Val Pro Ile Asp 
1 5 10 



<210> 33 

<211> 48 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(48) 

<223> Adenylation domain subdomain III 



<400> 33 

ctg gcg tgc gtg atg gtg acc tcc ggc tcg acc ggc cgg ccc aag ggc 48 
Leu Ala Cys Val Met Val Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly 
1 5 10 15 



<210> 34 
<211> 16 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 34 

Leu Ala Cys Val Met Val Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly 
1 5 10 15 



<210> 35 

<211> 12 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(12) 

<223> Adenylation domain subdomain IV 



<400> 35 
ttt gcg gtg tcg 
Phe Ala Val Ser 
1 



<210> 36 

<211> 4 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 36 

Phe Ala Val Ser 
1 



<210> 37 

<211> 21 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(21) 

<223> Adenylation domain subdomain V 



<400> 37 

aac aac tat ggc tgc acg gaa 
Asn Asn Tyr Gly Cys Thr Glu 
1 5 



<210> 38 

<211> 7 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 38 

Asn Asn Tyr Gly Cys Thr Glu 
1 5 



<210> 39 

<211> 45 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1) . . (45) 

<223> Adenylation domain subdomain VI 



<400> 39 

ggc gag ctg cac gta cac agc gtg ggg atg gcg cgc ggc tac tgg 
Gly Glu Leu His Val His Ser Val Gly Met Ala Arg Gly Tyr Trp 
1 5 10 15 



<210> 40 
<211> 15 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 40 

Gly Glu Leu His Val His Ser Val Gly Met Ala Arg Gly Tyr Trp 
1 5 10 15 



<210> 41 

<211> 18 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(18) 

<223> Adenylation domain subdomain VII 



<400> 41 

tac aag acc ggt gac atg 
Tyr Lys Thr Gly Asp Met 
1 5 



<210> 42 

<211> 6 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 42 

Tyr Lys Thr Gly Asp Met 
1 5 
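
The short subdomain peptides in this listing recur within the long polypeptide sequence given earlier. As a hedged illustration, not part of the original listing, the sketch below locates the peptide of SEQ ID NO: 42 within residues 4141 to 4155 of that polypeptide, copied from the listing above. The helper `find_motif` and its parameter names are hypothetical and introduced here for illustration only.

```python
def find_motif(residues, motif, first_position=1):
    """Return the listing positions at which `motif` starts within `residues`.

    `residues` and `motif` are lists of three-letter residue codes;
    `first_position` is the listing number of the first entry in `residues`.
    """
    n, m = len(residues), len(motif)
    return [
        first_position + i
        for i in range(n - m + 1)
        if residues[i:i + m] == motif
    ]


# Residues 4141 to 4155 of the long polypeptide, as printed in the listing.
fragment = "Ser Glu Glu Pro Gly Thr Arg Leu Tyr Lys Thr Gly Asp Met Val".split()

# Peptide of SEQ ID NO: 42.
motif = "Tyr Lys Thr Gly Asp Met".split()

hits = find_motif(fragment, motif, first_position=4141)
print(hits)  # start position(s) of the motif, numbered as in the listing
```

Comparing lists of three-letter codes, rather than raw strings, avoids false matches that begin partway through a residue name.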



<210> 43 

<211> 60 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(60) 

<223> Adenylation domain subdomain VIII 



<400> 43 

ggc cga cag gac ttc gag gtc aag gtg cgc ggc cac cgg gtg gat acg 48 
Gly Arg Gln Asp Phe Glu Val Lys Val Arg Gly His Arg Val Asp Thr 
1 5 10 15 

cgg cag gtg gag 60 
Arg Gln Val Glu 
20 



<210> 44 
<211> 20 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 44 

Gly Arg Gln Asp Phe Glu Val Lys Val Arg Gly His Arg Val Asp Thr 
1 5 10 15 



Arg Gln Val Glu 
20 



<210> 45 

<211> 21 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(21) 

<223> Adenylation domain subdomain IX 



<400> 45 

atc gcg cac ccg tat agc gag 
Ile Ala His Pro Tyr Ser Glu 
1 5 



<210> 46 

<211> 7 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 46 

Ile Ala His Pro Tyr Ser Glu 
1 5 



<210> 47 

<211> 18 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(18) 

<223> Adenylation domain subdomain X 



<400> 47 

aac ggc aag ttg gac cgg 
Asn Gly Lys Leu Asp Arg 
1 5 



<210> 48 
<211> 6 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 48 

Asn Gly Lys Leu Asp Arg 
1 5 



<210> 49 

<211> 33 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(33) 

<223> Peptidyl carrier protein 1 domain 



<400> 49 

atg gat ttc ttt gcc gtc ggc ggc cat tcg gtg 
Met Asp Phe Phe Ala Val Gly Gly His Ser Val 
1 5 10 



<210> 50 

<211> 11 

<212> PRT 

<213> Xanthomonas albilineans 



<400> 50 

Met Asp Phe Phe Ala Val Gly Gly His Ser Val 
1 5 10 



<210> 51 

<211> 33 

<212> DNA 

<213> Xanthomonas albilineans 



<220> 

<221> CDS 

<222> (1)..(33)

<223> Peptidyl carrier protein 2 domain



<400> 51 

gac aac ttc ttc gcc ttg ggt ggg cac tcg ctg
Asp Asn Phe Phe Ala Leu Gly Gly His Ser Leu
1 5 10



<210> 52 

<211> 11 

<212> PRT 

<213> Xanthomonas albilineans



<400> 52 



Asp Asn Phe Phe Ala Leu Gly Gly His Ser Leu 
1 5 10

<210> 53

<211> 30 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(30) 

<223> Condensation domain 1 subdomain I 



<400> 53 

act tat gca cag gag cgc cta tgg ctc gtc
Thr Tyr Ala Gln Glu Arg Leu Trp Leu Val
1 5 10 



<210> 54 
<211> 10 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 54 

Thr Tyr Ala Gln Glu Arg Leu Trp Leu Val
1 5 10



<210> 55 

<211> 27 

<212> DNA 

<213> Xanthomonas albilineans 
<220>

<221> CDS 

<222> (1)..(27) 

<223> Condensation domain 1 subdomain II 



<400> 55 

cgg cac gaa gtg ctg cgc aca cgc ttt
Arg His Glu Val Leu Arg Thr Arg Phe 
1 5 



<210> 56 

<211> 9 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 56 

Arg His Glu Val Leu Arg Thr Arg Phe 

1 5



<210> 57 

<211> 30 

<212> DNA 

<213> Xanthomonas albilineans
<220>
<221> CDS
<222> (1)..(30)

<223> Condensation domain 1 subdomain III 



<400> 57 

atc cac cac atc att tcc gac ggc tgg tcg
Ile His His Ile Ile Ser Asp Gly Trp Ser
1 5 10



<210> 58 
<211> 10 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 58 

Ile His His Ile Ile Ser Asp Gly Trp Ser
1 5 10



<210> 59 

<211> 21 

<212> DNA 

<213> Xanthomonas albilineans 

<220> 

<221> CDS 

<222> (1)..(21)

<223> Condensation domain 1 subdomain IV 



<400> 59 

tat gcc gac tac gcg ctg tgg
Tyr Ala Asp Tyr Ala Leu Trp
1 5 



<210> 60 

<211> 7 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 60 

Tyr Ala Asp Tyr Ala Leu Trp 
1 5 



<210> 61 

<211> 36 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(36) 

<223> Condensation domain 1 subdomain V 



<400> 61 

atc ggc ttt ttc atc aat att ttg ccg ctg cgg gtg 36
Ile Gly Phe Phe Ile Asn Ile Leu Pro Leu Arg Val
1 5 10



<210> 62 
<211> 12 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 62 

Ile Gly Phe Phe Ile Asn Ile Leu Pro Leu Arg Val
1 5 10



<210> 63 

<211> 21 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(21) 

<223> Condensation domain 1 subdomain VI 



<400> 63 

gcg cac cag tcg gtg ccg ttc
Ala His Gln Ser Val Pro Phe
1 5 



<210> 64 

<211> 7 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 64 

Ala His Gln Ser Val Pro Phe
1 5 



<210> 65 

<211> 24 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(24)

<223> Condensation domain 1 subdomain VII 



<400> 65 

cgc gac agt agc cag atc ccg ctg
Arg Asp Ser Ser Gln Ile Pro Leu
1 5

<210> 66
<211> 8
<212> PRT
<213> Xanthomonas albilineans

<400> 66

Arg Asp Ser Ser Gln Ile Pro Leu
1 5


<210> 67
<211> 30
<212> DNA
<213> Xanthomonas albilineans
<220>
<221> CDS
<222> (1)..(30)
<223> Condensation domain 2 subdomain



<400> 67 

tcg ctg ttc cag gaa cgc ctg tgg ttc gtg
Ser Leu Phe Gln Glu Arg Leu Trp Phe Val
1 5 10 



<210> 68
<211> 10
<212> PRT
<213> Xanthomonas albilineans

<400> 68

Ser Leu Phe Gln Glu Arg Leu Trp Phe Val
1 5 10


<210> 69
<211> 27
<212> DNA
<213> Xanthomonas albilineans
<220>
<221> CDS
<222> (1)..(27)
<223> Condensation domain 2 subdomain II



<400> 69 

cgc cac gaa atc ttg cgt acc cgc ttc
Arg His Glu Ile Leu Arg Thr Arg Phe
1 5


<210> 70
<211> 9
<212> PRT
<213> Xanthomonas

<400> 70

Arg His Glu Ile Leu Arg Thr Arg Phe
1 5

<210> 71

<211> 30 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(30)

<223> Condensation domain 2 subdomain III 



<400> 71 

atg cat cac ctt atc tac gac gct tgg tcg
Met His His Leu Ile Tyr Asp Ala Trp Ser
1 5 10



<210> 72 

<211> 10 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 72

Met His His Leu Ile Tyr Asp Ala Trp Ser
1 5 10



<210> 73
<211> 21
<212> DNA
<213> Xanthomonas albilineans
<220>
<221> CDS
<222> (1)..(21)
<223> Condensation domain 2 subdomain

<400> 73

tat gcc gac tat gcg atc tgg
Tyr Ala Asp Tyr Ala Ile Trp
1 5



<210> 74 

<211> 7 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 74 

Tyr Ala Asp Tyr Ala Ile Trp
1 5 



<210> 75 

<211> 33 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 
<221> CDS
<222> (1)..(33) 

<223> Condensation domain 2 subdomain V 



<400> 75 

atc ggt ttc ttc atc aac atc ttg cct ttg cgc 33
Ile Gly Phe Phe Ile Asn Ile Leu Pro Leu Arg
1 5 10



<210> 76 
<211> 11 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 76 

Ile Gly Phe Phe Ile Asn Ile Leu Pro Leu Arg
1 5 10



<210> 77 

<211> 21 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(21)

<223> Condensation domain 2 subdomain VI 



<400> 77 

aac cag gcg cta ccg ttc gag
Asn Gln Ala Leu Pro Phe Glu
1 5 

<210> 78 

<211> 7 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 78 

Asn Gln Ala Leu Pro Phe Glu
1 5 



<210> 79 

<211> 24 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(24) 

<223> Condensation domain 2 subdomain VII 



<400> 79 

cgt gac acc agc cgg att ccg cta 24
Arg Asp Thr Ser Arg Ile Pro Leu
1 5

<210> 80
<211> 8
<212> PRT
<213> Xanthomonas

<400> 80

Arg Asp Thr Ser Arg Ile Pro Leu
1 5



<210> 81 

<211> 242 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> promoter 

<222> (1)..(242)

<223> Bidirectional xabB promoter 



<400> 81 

catcacgcca cctccagcag ggtgtcatac acggccagcg gatgctgcag gttttccacl- 60 

ggcagggcca ctggctgtcg taagggaagc ggtgccttga gcgccggtgc ggacagtata 120 

acgacacgtt ccttggccaa gcgcactgtc ggcacggcct tgctgatgcc gcccatgtag 180 

ccgcgcgcct ggatctcgcg tagtagcacc acgctggccg ggatccatcg agggcgcgct 240 

tg 242

<210> 82
<211> 1200
<212> DNA
<213> Xanthomonas albilineans
<220>
<221> CDS
<222> (149)..(982)
<223>

<400> 82



^v^w^www ^.^^^ gcccctcccc gaccccaagc atcgaccaa*- 60 

ggaccgaatg cggcgggtag gcgcgactct gcgacactag cgcaatgtta tcgtcgacat 120 

tgacgcccac agccctcagc gcaacgca atg ccc aat gcc gta ccg atg cag 172 

Met Pro Asn Ala Val Pro Met Gln
1 5

ggc gcg cgg gga ctc ccg cag ccg caa gcg atg aac cca ggg ttg ccg 220
Gly Ala Arg Gly Leu Pro Gln Pro Gln Ala Met Asn Pro Gly Leu Pro
10 15 20

agc gtc ggc ggc ttg agc gca ggc cag cca ttg cag ttg tcg tta gca 268
Ser Val Gly Gly Leu Ser Ala Gly Gln Pro Leu Gln Leu Ser Leu Ala
25 30 35 40

ccg gaa ctg cag gca gcc gcg cgc agt gcc cac cgc cat ctg ctc gac 316
Pro Glu Leu Gln Ala Ala Ala Arg Ser Ala His Arg His Leu Leu Asp
45 50 55

gac ggc acg gcg ctt tac ctg ctg gcg ttc gat acc gcg caa ttc gac 364
Asp Gly Thr Ala Leu Tyr Leu Leu Ala Phe Asp Thr Ala Gln Phe Asp
60 65 70

ccg ggg gct ttc gcg gca atg gca atc gcc cgc ccg gac agc atc gcc 412
Pro Gly Ala Phe Ala Ala Met Ala Ile Ala Arg Pro Asp Ser Ile Ala
75 80 85

cgc agc gtg cgc aag cgt cag gcc gag ttc ctg ttc ggc cgt ctg gcc 460
Arg Ser Val Arg Lys Arg Gln Ala Glu Phe Leu Phe Gly Arg Leu Ala
90 95 100

gcg cga ctg gcg ctg caa gag gtg ctg gga cct gcg caa gcg cag gca 508
Ala Arg Leu Ala Leu Gln Glu Val Leu Gly Pro Ala Gln Ala Gln Ala
105 110 115 120

gat att gca atc ggc gcg acg cgc gcg ccc tgc tgg cct gcc ggc agc 556
Asp Ile Ala Ile Gly Ala Thr Arg Ala Pro Cys Trp Pro Ala Gly Ser
125 130 135
125 130 135 



ctg ggc agc att tcc cat tgc gag gac tac gcg gcc gcc atc gcc atg 604
Leu Gly Ser Ile Ser His Cys Glu Asp Tyr Ala Ala Ala Ile Ala Met
140 145 150

gcg gcc ggc acc cgc cac ggc gtg ggc atc gat ctg gaa cga cca atc 652
Ala Ala Gly Thr Arg His Gly Val Gly Ile Asp Leu Glu Arg Pro Ile
155 160 165

aca ccc gcg gcg cgc gcg gcg ttg ctg agc atc gca atc gat gcc gac 700
Thr Pro Ala Ala Arg Ala Ala Leu Leu Ser Ile Ala Ile Asp Ala Asp
170 175 180

gaa gcc gct cgt ctg gca aag gcg gca gac gcg cag tgg ccg caa gac 748
Glu Ala Ala Arg Leu Ala Lys Ala Ala Asp Ala Gln Trp Pro Gln Asp
185 190 195 200

ctg ctg ctg acc gca cta ttt tcg gcc aag gaa agc ctg ttc aaa gcc 796
Leu Leu Leu Thr Ala Leu Phe Ser Ala Lys Glu Ser Leu Phe Lys Ala
205 210 215

gcc tac agc gcg gtc gga cgc tac ttc gac ttc agc gcg gca cgc ctg 844
Ala Tyr Ser Ala Val Gly Arg Tyr Phe Asp Phe Ser Ala Ala Arg Leu
220 225 230

tgc ggc atc gac ctg gca cgg caa tgc ctg cat ctg cgc ctg acc gag 892
Cys Gly Ile Asp Leu Ala Arg Gln Cys Leu His Leu Arg Leu Thr Glu
235 240 245

aca ctc tgc gcg caa ttc gtg gcc ggg caa gtg tgc gag gtc ggc ttc 940
Thr Leu Cys Ala Gln Phe Val Ala Gly Gln Val Cys Glu Val Gly Phe
250 255 260

gcg cgc cta cca ccg gac ctg gtg ctc acc cac tac gcc tgg 982
Ala Arg Leu Pro Pro Asp Leu Val Leu Thr His Tyr Ala Trp
265 270 275



tgagcacgcg gacagtcgaa cccgccaacg ccaacggcac tcaagacgtg gcgtgcgccg 1042
cgtcggtcgt gaagctctcc ccgcagccgc actcggcggt ggcattggga ttgcggaaca 1102
cgaaggtctc acccaagccc tgcttggcga agtcgatttc ggtgccatcg accaactgca 1162
gactggcggc atcgacataa atccgcactc cgtcctgc 1200 



<210> 83 
<211> 278 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 83 

Met Pro Asn Ala Val Pro Met Gln Gly Ala Arg Gly Leu Pro Gln Pro
1 5 10 15

Gln Ala Met Asn Pro Gly Leu Pro Ser Val Gly Gly Leu Ser Ala Gly
20 25 30

Gln Pro Leu Gln Leu Ser Leu Ala Pro Glu Leu Gln Ala Ala Ala Arg
35 40 45

Ser Ala His Arg His Leu Leu Asp Asp Gly Thr Ala Leu Tyr Leu Leu
50 55 60

Ala Phe Asp Thr Ala Gln Phe Asp Pro Gly Ala Phe Ala Ala Met Ala
65 70 75 80

Ile Ala Arg Pro Asp Ser Ile Ala Arg Ser Val Arg Lys Arg Gln Ala
85 90 95

Glu Phe Leu Phe Gly Arg Leu Ala Ala Arg Leu Ala Leu Gln Glu Val
100 105 110

Leu Gly Pro Ala Gln Ala Gln Ala Asp Ile Ala Ile Gly Ala Thr Arg
115 120 125

Ala Pro Cys Trp Pro Ala Gly Ser Leu Gly Ser Ile Ser His Cys Glu
130 135 140

Asp Tyr Ala Ala Ala Ile Ala Met Ala Ala Gly Thr Arg His Gly Val
145 150 155 160

Gly Ile Asp Leu Glu Arg Pro Ile Thr Pro Ala Ala Arg Ala Ala Leu
165 170 175

Leu Ser Ile Ala Ile Asp Ala Asp Glu Ala Ala Arg Leu Ala Lys Ala
180 185 190

Ala Asp Ala Gln Trp Pro Gln Asp Leu Leu Leu Thr Ala Leu Phe Ser
195 200 205

Ala Lys Glu Ser Leu Phe Lys Ala Ala Tyr Ser Ala Val Gly Arg Tyr
210 215 220

Phe Asp Phe Ser Ala Ala Arg Leu Cys Gly Ile Asp Leu Ala Arg Gln
225 230 235 240

Cys Leu His Leu Arg Leu Thr Glu Thr Leu Cys Ala Gln Phe Val Ala
245 250 255

Gly Gln Val Cys Glu Val Gly Phe Ala Arg Leu Pro Pro Asp Leu Val
260 265 270

Leu Thr His Tyr Ala Trp
275



<210> 84 

<211> 837 

<212> DNA 

<213> Xanthomonas albilineans 



<220>
<221> CDS
<222> (1)..(837)
<223>



<400> 84 

atg ccc aat gcc gta ccg atg cag ggc gcg cgg gga ctc ccg cag ccg 48
Met Pro Asn Ala Val Pro Met Gln Gly Ala Arg Gly Leu Pro Gln Pro
1 5 10 15

caa gcg atg aac cca ggg ttg ccg agc gtc ggc ggc ttg agc gca ggc 96
Gln Ala Met Asn Pro Gly Leu Pro Ser Val Gly Gly Leu Ser Ala Gly
20 25 30

cag cca ttg cag ttg tcg tta gca ccg gaa ctg cag gca gcc gcg cgc 144
Gln Pro Leu Gln Leu Ser Leu Ala Pro Glu Leu Gln Ala Ala Ala Arg
35 40 45

agt gcc cac cgc cat ctg ctc gac gac ggc acg gcg ctt tac ctg ctg 192
Ser Ala His Arg His Leu Leu Asp Asp Gly Thr Ala Leu Tyr Leu Leu
50 55 60

gcg ttc gat acc gcg caa ttc gac ccg ggg gct ttc gcg gca atg gca 240
Ala Phe Asp Thr Ala Gln Phe Asp Pro Gly Ala Phe Ala Ala Met Ala
65 70 75 80

atc gcc cgc ccg gac agc atc gcc cgc agc gtg cgc aag cgt cag gcc 288
Ile Ala Arg Pro Asp Ser Ile Ala Arg Ser Val Arg Lys Arg Gln Ala
85 90 95

gag ttc ctg ttc ggc cgt ctg gcc gcg cga ctg gcg ctg caa gag gtg 336
Glu Phe Leu Phe Gly Arg Leu Ala Ala Arg Leu Ala Leu Gln Glu Val
100 105 110

ctg gga cct gcg caa gcg cag gca gat att gca atc ggc gcg acg cgc 384
Leu Gly Pro Ala Gln Ala Gln Ala Asp Ile Ala Ile Gly Ala Thr Arg
115 120 125

gcg ccc tgc tgg cct gcc ggc agc ctg ggc agc att tcc cat tgc gag 432
Ala Pro Cys Trp Pro Ala Gly Ser Leu Gly Ser Ile Ser His Cys Glu
130 135 140

gac tac gcg gcc gcc atc gcc atg gcg gcc ggc acc cgc cac ggc gtg 480
Asp Tyr Ala Ala Ala Ile Ala Met Ala Ala Gly Thr Arg His Gly Val
145 150 155 160

ggc atc gat ctg gaa cga cca atc aca ccc gcg gcg cgc gcg gcg ttg 528
Gly Ile Asp Leu Glu Arg Pro Ile Thr Pro Ala Ala Arg Ala Ala Leu
165 170 175

ctg agc atc gca atc gat gcc gac gaa gcc gct cgt ctg gca aag gcg 576
Leu Ser Ile Ala Ile Asp Ala Asp Glu Ala Ala Arg Leu Ala Lys Ala
180 185 190

gca gac gcg cag tgg ccg caa gac ctg ctg ctg acc gca cta ttt tcg 624
Ala Asp Ala Gln Trp Pro Gln Asp Leu Leu Leu Thr Ala Leu Phe Ser
195 200 205

gcc aag gaa agc ctg ttc aaa gcc gcc tac agc gcg gtc gga cgc tac 672
Ala Lys Glu Ser Leu Phe Lys Ala Ala Tyr Ser Ala Val Gly Arg Tyr
210 215 220

ttc gac ttc agc gcg gca cgc ctg tgc ggc atc gac ctg gca cgg caa 720
Phe Asp Phe Ser Ala Ala Arg Leu Cys Gly Ile Asp Leu Ala Arg Gln
225 230 235 240

tgc ctg cat ctg cgc ctg acc gag aca ctc tgc gcg caa ttc gtg gcc 768
Cys Leu His Leu Arg Leu Thr Glu Thr Leu Cys Ala Gln Phe Val Ala
245 250 255

ggg caa gtg tgc gag gtc ggc ttc gcg cgc cta cca ccg gac ctg gtg 816
Gly Gln Val Cys Glu Val Gly Phe Ala Arg Leu Pro Pro Asp Leu Val
260 265 270

ctc acc cac tac gcc tgg tga 837
Leu Thr His Tyr Ala Trp
275

<210> 85
<211> 278
<212> PRT
<213> Xanthomonas albilineans

<400> 85

Met Pro Asn Ala Val Pro Met Gln Gly Ala Arg Gly Leu Pro Gln Pro
1 5 10 15

Gln Ala Met Asn Pro Gly Leu Pro Ser Val Gly Gly Leu Ser Ala Gly
20 25 30

Gln Pro Leu Gln Leu Ser Leu Ala Pro Glu Leu Gln Ala Ala Ala Arg
35 40 45

Ser Ala His Arg His Leu Leu Asp Asp Gly Thr Ala Leu Tyr Leu Leu
50 55 60

Ala Phe Asp Thr Ala Gln Phe Asp Pro Gly Ala Phe Ala Ala Met Ala
65 70 75 80

Ile Ala Arg Pro Asp Ser Ile Ala Arg Ser Val Arg Lys Arg Gln Ala
85 90 95

Glu Phe Leu Phe Gly Arg Leu Ala Ala Arg Leu Ala Leu Gln Glu Val
100 105 110

Leu Gly Pro Ala Gln Ala Gln Ala Asp Ile Ala Ile Gly Ala Thr Arg
115 120 125

Ala Pro Cys Trp Pro Ala Gly Ser Leu Gly Ser Ile Ser His Cys Glu
130 135 140

Asp Tyr Ala Ala Ala Ile Ala Met Ala Ala Gly Thr Arg His Gly Val
145 150 155 160

Gly Ile Asp Leu Glu Arg Pro Ile Thr Pro Ala Ala Arg Ala Ala Leu
165 170 175

Leu Ser Ile Ala Ile Asp Ala Asp Glu Ala Ala Arg Leu Ala Lys Ala
180 185 190

Ala Asp Ala Gln Trp Pro Gln Asp Leu Leu Leu Thr Ala Leu Phe Ser
195 200 205

Ala Lys Glu Ser Leu Phe Lys Ala Ala Tyr Ser Ala Val Gly Arg Tyr
210 215 220

Phe Asp Phe Ser Ala Ala Arg Leu Cys Gly Ile Asp Leu Ala Arg Gln
225 230 235 240

Cys Leu His Leu Arg Leu Thr Glu Thr Leu Cys Ala Gln Phe Val Ala
245 250 255

Gly Gln Val Cys Glu Val Gly Phe Ala Arg Leu Pro Pro Asp Leu Val
260 265 270

Leu Thr His Tyr Ala Trp
275



<210> 86 

<211> 180 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 
<222> (1)..(168)
<223>

<400> 86

ggc gtg ggc atc gat ctg gaa cga cca atc aca ccc gcg gcg cgc gcg 48
Gly Val Gly Ile Asp Leu Glu Arg Pro Ile Thr Pro Ala Ala Arg Ala
1 5 10 15

gcg ttg ctg agc atc gca atc gat gcc gac gaa gcc gct cgt ctg gca 96
Ala Leu Leu Ser Ile Ala Ile Asp Ala Asp Glu Ala Ala Arg Leu Ala
20 25 30

aag gcg gca gac gcg cag tgg ccg caa gac ctg ctg ctg acc gca cta 144
Lys Ala Ala Asp Ala Gln Trp Pro Gln Asp Leu Leu Leu Thr Ala Leu
35 40 45

ttt tcg gcc aag gaa agc ctg ttc aaagccgcct ac 180
Phe Ser Ala Lys Glu Ser Leu Phe
50 55



<210> 87 
<211> 56 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 87 

Gly Val Gly Ile Asp Leu Glu Arg Pro Ile Thr Pro Ala Ala Arg Ala
1 5 10 15

Ala Leu Leu Ser Ile Ala Ile Asp Ala Asp Glu Ala Ala Arg Leu Ala
20 25 30

Lys Ala Ala Asp Ala Gln Trp Pro Gln Asp Leu Leu Leu Thr Ala Leu
35 40 45

Phe Ser Ala Lys Glu Ser Leu Phe
50 55



<210> 88
<211> 27
<212> DNA
<213> Xanthomonas
<220>
<221> CDS
<222> (1)..(27)
<223>

<400> 88

ggc gtg ggc atc gat ctg gaa cga cca 27
Gly Val Gly Ile Asp Leu Glu Arg Pro
1 5



<210> 89 
<211> 9
<212> PRT 

<213> Xanthomonas albilineans 
<400> 89 

Gly Val Gly Ile Asp Leu Glu Arg Pro
1 5 



<210> 90 

<211> 117 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(117) 

<223> 



<400> 90 

atc aca ccc gcg gcg cgc gcg gcg ttg ctg agc atc gca atc gat gcc 48
Ile Thr Pro Ala Ala Arg Ala Ala Leu Leu Ser Ile Ala Ile Asp Ala
1 5 10 15

gac gaa gcc gct cgt ctg gca aag gcg gca gac gcg cag tgg ccg caa 96
Asp Glu Ala Ala Arg Leu Ala Lys Ala Ala Asp Ala Gln Trp Pro Gln
20 25 30

gac ctg ctg ctg acc gca cta 117
Asp Leu Leu Leu Thr Ala Leu
35 



<210> 91

<211> 39 

<212> PRT 

<213> Xanthomonas albilineans

<400> 91 

Ile Thr Pro Ala Ala Arg Ala Ala Leu Leu Ser Ile Ala Ile Asp Ala
1 5 10 15

Asp Glu Ala Ala Arg Leu Ala Lys Ala Ala Asp Ala Gln Trp Pro Gln
20 25 30

Asp Leu Leu Leu Thr Ala Leu
35 



<210> 92 

<211> 36 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(36) 
<223> 
<400> 92

ttt tcg gcc aag gaa agc ctg ttc aaa gcc gcc tac 36
Phe Ser Ala Lys Glu Ser Leu Phe Lys Ala Ala Tyr 
1 5 10 



<210> 93 
<211> 12 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 93 

Phe Ser Ala Lys Glu Ser Leu Phe Lys Ala Ala Tyr 
1 5 10



<210> 94 

<211> 1515 

<212> DNA 

<213> Xanthomonas albilineans

<220> 

<221> CDS 

<222> (46)..(1071)

<223> 



<400> 94 

caaaagccgg ccgccgtcac ccgttcatcg atagcgaggg caatc atg gat tca gcg 57
Met Asp Ser Ala
1

tta cct aca tct gca ttt acc ttc gat ctc ttt tac acc acg gtt aac 105
Leu Pro Thr Ser Ala Phe Thr Phe Asp Leu Phe Tyr Thr Thr Val Asn
5 10 15 20

gcc tac tat cgc act gcc gca gtc aag gcg gcg atc gaa ctg ggg cta 153
Ala Tyr Tyr Arg Thr Ala Ala Val Lys Ala Ala Ile Glu Leu Gly Leu
25 30 35

ttc gat gtg gtg ggg cag cag ggc cga act ccc gca gcc atc gcc gag 201
Phe Asp Val Val Gly Gln Gln Gly Arg Thr Pro Ala Ala Ile Ala Glu
40 45 50

gcc tgc cag gcg tcg ccg cgc ggc att cgc atc ctt tgc tat tac cta 249
Ala Cys Gln Ala Ser Pro Arg Gly Ile Arg Ile Leu Cys Tyr Tyr Leu
55 60 65

gta tcg atc ggt ttt cta cgc cgc aac ggt ggc ctg ttc tac ata gat 297
Val Ser Ile Gly Phe Leu Arg Arg Asn Gly Gly Leu Phe Tyr Ile Asp
70 75 80

cgc aac atg gcc atg tac ctg gat cgt agt tcg ccc ggc tac ctg ggt 345
Arg Asn Met Ala Met Tyr Leu Asp Arg Ser Ser Pro Gly Tyr Leu Gly
85 90 95 100

ggc agc atc aag ttc ctg ctc tcg ccc tac atc atg agc gcc ttc acc 393
Gly Ser Ile Lys Phe Leu Leu Ser Pro Tyr Ile Met Ser Ala Phe Thr
105 110 115

gat ctg acc gcc gta gtc agg acc ggc aag atc aac ctg gcg cag gac 441
Asp Leu Thr Ala Val Val Arg Thr Gly Lys Ile Asn Leu Ala Gln Asp
120 125 130

ggc gtg gtg gca ccg gat cac ccg cag tgg gtg gaa ttt gca cgc gcg 489
Gly Val Val Ala Pro Asp His Pro Gln Trp Val Glu Phe Ala Arg Ala
135 140 145

atg gca ccg atg atg gcg ctg ccc tcg gcg ttg atc gcc aat atg gtg 537
Met Ala Pro Met Met Ala Leu Pro Ser Ala Leu Ile Ala Asn Met Val
150 155 160

tcg ttg ccc gct gat cgg ccg att cgt gtg ctg gac gtg gca gcc ggc 585
Ser Leu Pro Ala Asp Arg Pro Ile Arg Val Leu Asp Val Ala Ala Gly
165 170 175 180

cac ggc ctg ttc ggc atc gcc ttc gcg cag cgc ttc cgc cag gct gaa 633
His Gly Leu Phe Gly Ile Ala Phe Ala Gln Arg Phe Arg Gln Ala Glu
185 190 195

gtg agc ttc ctg gac tgg gac aac gtg cta gac gta gca cgc gaa aac 681
Val Ser Phe Leu Asp Trp Asp Asn Val Leu Asp Val Ala Arg Glu Asn
200 205 210

gcc cag gcg gcc aaa gtg gcc gag cga gcg cgt ttc ctg ccc ggc aac 729
Ala Gln Ala Ala Lys Val Ala Glu Arg Ala Arg Phe Leu Pro Gly Asn
215 220 225

gca ttc gac ctc gat tac ggc agc ggc tac gac gtg atc ttg ttg acc 777
Ala Phe Asp Leu Asp Tyr Gly Ser Gly Tyr Asp Val Ile Leu Leu Thr
230 235 240

aac ttc ctg cac cat ttc gat gag gtc gat ggc gag cgc atc ttg gct 825
Asn Phe Leu His His Phe Asp Glu Val Asp Gly Glu Arg Ile Leu Ala
245 250 255 260

aag acg cgc gat gcg ctg aac gac gac ggc atg gtg atc act ttc gaa 873
Lys Thr Arg Asp Ala Leu Asn Asp Asp Gly Met Val Ile Thr Phe Glu
265 270 275

ttc atc gcc gac gaa gag cgt tcc tca ccg ccg ctg gcc gcc acc ttc 921
Phe Ile Ala Asp Glu Glu Arg Ser Ser Pro Pro Leu Ala Ala Thr Phe
280 285 290

agc atg atg atg ctg ggc acc acc ccg gcg ggc gag tcc tac acc tat 969
Ser Met Met Met Leu Gly Thr Thr Pro Ala Gly Glu Ser Tyr Thr Tyr
295 300 305

agc gat ctg gaa agg atg ttt cgg cat gcc ggc ttc ggc cac gtg gaa 1017
Ser Asp Leu Glu Arg Met Phe Arg His Ala Gly Phe Gly His Val Glu
310 315 320

cta aaa tcg ata ccg ccg gcc ttg ctg aaa gtg gtg gtt tcc cgc aag 1065
Leu Lys Ser Ile Pro Pro Ala Leu Leu Lys Val Val Val Ser Arg Lys
325 330 335 340

agg gcc ccataatgat cgaatcggcg acatcccctg tggcgaaaac cgagcgcatc 1121
Arg Ala

tggtgcaccg agctggacct ggatgcactc aacgccatgt cggccaacac gatgcaggcc 1181
ctgctcggta tacgcatgat cgagatcggc tcggactatc tggtctcctg catgtcggtg 1241
gactggcgtt gccaccagcc ctatggggta ttgcatggcg gcgcatcggt caccctggcc 1301
gaggctaccg gcagcatggc ggcctccatg tgcgtgccgg ccggccaacg ttgcgttggc 1361
ctagacatca atgccaacca catcgcgagc atctccagtg gccaagtaca gtgcatcgcj 1421
cggccgctgc acataggggc cttgacccag gtatggcaga tgcgcatcta tgacgaaggt 1481
gaccgcacga tctgcgtgtc gcgcctgacc atgg 1515



<210> 95 
<211> 342 
<212> PRT 

<213> Xanthomonas albilineans 
<400> 95 

Met Asp Ser Ala Leu Pro Thr Ser Ala Phe Thr Phe Asp Leu Phe Tyr
1 5 10 15

Thr Thr Val Asn Ala Tyr Tyr Arg Thr Ala Ala Val Lys Ala Ala Ile
20 25 30

Glu Leu Gly Leu Phe Asp Val Val Gly Gln Gln Gly Arg Thr Pro Ala
35 40 45

Ala Ile Ala Glu Ala Cys Gln Ala Ser Pro Arg Gly Ile Arg Ile Leu
50 55 60

Cys Tyr Tyr Leu Val Ser Ile Gly Phe Leu Arg Arg Asn Gly Gly Leu
65 70 75 80

Phe Tyr Ile Asp Arg Asn Met Ala Met Tyr Leu Asp Arg Ser Ser Pro
85 90 95

Gly Tyr Leu Gly Gly Ser Ile Lys Phe Leu Leu Ser Pro Tyr Ile Met
100 105 110

Ser Ala Phe Thr Asp Leu Thr Ala Val Val Arg Thr Gly Lys Ile Asn
115 120 125

Leu Ala Gln Asp Gly Val Val Ala Pro Asp His Pro Gln Trp Val Glu
130 135 140

Phe Ala Arg Ala Met Ala Pro Met Met Ala Leu Pro Ser Ala Leu Ile
145 150 155 160

Ala Asn Met Val Ser Leu Pro Ala Asp Arg Pro Ile Arg Val Leu Asp
165 170 175

Val Ala Ala Gly His Gly Leu Phe Gly Ile Ala Phe Ala Gln Arg Phe
180 185 190

Arg Gln Ala Glu Val Ser Phe Leu Asp Trp Asp Asn Val Leu Asp Val
195 200 205

Ala Arg Glu Asn Ala Gln Ala Ala Lys Val Ala Glu Arg Ala Arg Phe
210 215 220

Leu Pro Gly Asn Ala Phe Asp Leu Asp Tyr Gly Ser Gly Tyr Asp Val
225 230 235 240

Ile Leu Leu Thr Asn Phe Leu His His Phe Asp Glu Val Asp Gly Glu
245 250 255

Arg Ile Leu Ala Lys Thr Arg Asp Ala Leu Asn Asp Asp Gly Met Val
260 265 270

Ile Thr Phe Glu Phe Ile Ala Asp Glu Glu Arg Ser Ser Pro Pro Leu
275 280 285

Ala Ala Thr Phe Ser Met Met Met Leu Gly Thr Thr Pro Ala Gly Glu
290 295 300

Ser Tyr Thr Tyr Ser Asp Leu Glu Arg Met Phe Arg His Ala Gly Phe
305 310 315 320

Gly His Val Glu Leu Lys Ser Ile Pro Pro Ala Leu Leu Lys Val Val
325 330 335

Val Ser Arg Lys Arg Ala
340


<210> 96
<211> 1032
<212> DNA
<213> Xanthomonas
<220>
<221> CDS
<222> (1)..(1029)
<223>

<400> 96



atg gat tca gcg tta cct aca tct gca ttt acc ttc gat ctc ttt tac 48
Met Asp Ser Ala Leu Pro Thr Ser Ala Phe Thr Phe Asp Leu Phe Tyr
1 5 10 15

acc acg gtt aac gcc tac tat cgc act gcc gca gtc aag gcg gcg atc 96
Thr Thr Val Asn Ala Tyr Tyr Arg Thr Ala Ala Val Lys Ala Ala Ile
20 25 30

gaa ctg ggg cta ttc gat gtg gtg ggg cag cag ggc cga act ccc gca 144
Glu Leu Gly Leu Phe Asp Val Val Gly Gln Gln Gly Arg Thr Pro Ala
35 40 45

gcc atc gcc gag gcc tgc cag gcg tcg ccg cgc ggc att cgc atc ctt 192
Ala Ile Ala Glu Ala Cys Gln Ala Ser Pro Arg Gly Ile Arg Ile Leu
50 55 60

tgc tat tac cta gta tcg atc ggt ttt cta cgc cgc aac ggt ggc ctg 240
Cys Tyr Tyr Leu Val Ser Ile Gly Phe Leu Arg Arg Asn Gly Gly Leu
65 70 75 80

ttc tac ata gat cgc aac atg gcc atg tac ctg gat cgt agt tcg ccc 288
Phe Tyr Ile Asp Arg Asn Met Ala Met Tyr Leu Asp Arg Ser Ser Pro
85 90 95

ggc tac ctg ggt ggc agc atc aag ttc ctg ctc tcg ccc tac atc atg 336
Gly Tyr Leu Gly Gly Ser Ile Lys Phe Leu Leu Ser Pro Tyr Ile Met
100 105 110

agc gcc ttc acc gat ctg acc gcc gta gtc agg acc ggc aag atc aac 384
Ser Ala Phe Thr Asp Leu Thr Ala Val Val Arg Thr Gly Lys Ile Asn
115 120 125

ctg gcg cag gac ggc gtg gtg gca ccg gat cac ccg cag tgg gtg gaa 432
Leu Ala Gln Asp Gly Val Val Ala Pro Asp His Pro Gln Trp Val Glu
130 135 140

ttt gca cgc gcg atg gca ccg atg atg gcg ctg ccc tcg gcg ttg atc 480
Phe Ala Arg Ala Met Ala Pro Met Met Ala Leu Pro Ser Ala Leu Ile
145 150 155 160

gcc aat atg gtg tcg ttg ccc gct gat cgg ccg att cgt gtg ctg gac 528
Ala Asn Met Val Ser Leu Pro Ala Asp Arg Pro Ile Arg Val Leu Asp
165 170 175

gtg gca gcc ggc cac ggc ctg ttc ggc atc gcc ttc gcg cag cgc ttc 576
Val Ala Ala Gly His Gly Leu Phe Gly Ile Ala Phe Ala Gln Arg Phe
180 185 190

cgc cag gct gaa gtg agc ttc ctg gac tgg gac aac gtg cta gac gta 624
Arg Gln Ala Glu Val Ser Phe Leu Asp Trp Asp Asn Val Leu Asp Val
195 200 205

gca cgc gaa aac gcc cag gcg gcc aaa gtg gcc gag cga gcg cgt ttc 672
Ala Arg Glu Asn Ala Gln Ala Ala Lys Val Ala Glu Arg Ala Arg Phe
210 215 220

ctg ccc ggc aac gca ttc gac ctc gat tac ggc agc ggc tac gac gtg 720
Leu Pro Gly Asn Ala Phe Asp Leu Asp Tyr Gly Ser Gly Tyr Asp Val
225 230 235 240

atc ttg ttg acc aac ttc ctg cac cat ttc gat gag gtc gat ggc gag 768
Ile Leu Leu Thr Asn Phe Leu His His Phe Asp Glu Val Asp Gly Glu
245 250 255

cgc atc ttg gct aag acg cgc gat gcg ctg aac gac gac ggc atg gtg 816
Arg Ile Leu Ala Lys Thr Arg Asp Ala Leu Asn Asp Asp Gly Met Val
260 265 270

atc act ttc gaa ttc atc gcc gac gaa gag cgt tcc tca ccg ccg ctg 864
Ile Thr Phe Glu Phe Ile Ala Asp Glu Glu Arg Ser Ser Pro Pro Leu
275 280 285

gcc gcc acc ttc agc atg atg atg ctg ggc acc acc ccg gcg ggc gag 912
Ala Ala Thr Phe Ser Met Met Met Leu Gly Thr Thr Pro Ala Gly Glu
290 295 300

tcc tac acc tat agc gat ctg gaa agg atg ttt cgg cat gcc ggc ttc 960
Ser Tyr Thr Tyr Ser Asp Leu Glu Arg Met Phe Arg His Ala Gly Phe
305 310 315 320

ggc cac gtg gaa cta aaa tcg ata ccg ccg gcc ttg ctg aaa gtg gtg 1008
Gly His Val Glu Leu Lys Ser Ile Pro Pro Ala Leu Leu Lys Val Val
325 330 335

gtt tcc cgc aag agg gcc cca taa 1032
Val Ser Arg Lys Arg Ala Pro
340

<210> 97
<211> 343
<212> PRT
<213> Xanthomonas albilineans

<400> 97

Met Asp Ser Ala Leu Pro Thr Ser Ala Phe Thr Phe Asp Leu Phe Tyr
1 5 10 15

Thr Thr Val Asn Ala Tyr Tyr Arg Thr Ala Ala Val Lys Ala Ala Ile
20 25 30

Glu Leu Gly Leu Phe Asp Val Val Gly Gln Gln Gly Arg Thr Pro Ala
35 40 45

Ala Ile Ala Glu Ala Cys Gln Ala Ser Pro Arg Gly Ile Arg Ile Leu
50 55 60

Cys Tyr Tyr Leu Val Ser Ile Gly Phe Leu Arg Arg Asn Gly Gly Leu
65 70 75 80

Phe Tyr Ile Asp Arg Asn Met Ala Met Tyr Leu Asp Arg Ser Ser Pro
85 90 95

Gly Tyr Leu Gly Gly Ser Ile Lys Phe Leu Leu Ser Pro Tyr Ile Met
100 105 110

Ser Ala Phe Thr Asp Leu Thr Ala Val Val Arg Thr Gly Lys Ile Asn
115 120 125

Leu Ala Gln Asp Gly Val Val Ala Pro Asp His Pro Gln Trp Val Glu
130 135 140

Phe Ala Arg Ala Met Ala Pro Met Met Ala Leu Pro Ser Ala Leu Ile
145 150 155 160

Ala Asn Met Val Ser Leu Pro Ala Asp Arg Pro Ile Arg Val Leu Asp
165 170 175

Val Ala Ala Gly His Gly Leu Phe Gly Ile Ala Phe Ala Gln Arg Phe
180 185 190

Arg Gln Ala Glu Val Ser Phe Leu Asp Trp Asp Asn Val Leu Asp Val
195 200 205

Ala Arg Glu Asn Ala Gln Ala Ala Lys Val Ala Glu Arg Ala Arg Phe
210 215 220

Leu Pro Gly Asn Ala Phe Asp Leu Asp Tyr Gly Ser Gly Tyr Asp Val
225 230 235 240

Ile Leu Leu Thr Asn Phe Leu His His Phe Asp Glu Val Asp Gly Glu
245 250 255

Arg Ile Leu Ala Lys Thr Arg Asp Ala Leu Asn Asp Asp Gly Met Val
260 265 270

Ile Thr Phe Glu Phe Ile Ala Asp Glu Glu Arg Ser Ser Pro Pro Leu
275 280 285

Ala Ala Thr Phe Ser Met Met Met Leu Gly Thr Thr Pro Ala Gly Glu
290 295 300

Ser Tyr Thr Tyr Ser Asp Leu Glu Arg Met Phe Arg His Ala Gly Phe
305 310 315 320

Gly His Val Glu Leu Lys Ser Ile Pro Pro Ala Leu Leu Lys Val Val
325 330 335

Val Ser Arg Lys Arg Ala Pro
340



<210> 98 

<211> 21 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(21) 

<223> Motif I 



<400> 98 

gtg ctg gac gtg gca gcc ggc 

Val Leu Asp Val Ala Ala Gly 
1 5 
<210> 99

<211> 7 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 99 

Val Leu Asp Val Ala Ala Gly 
1 5 



<210> 100 

<211> 24 

<212> DNA 

<213> Xanthomonas albilineans



<220> 

<221> CDS 

<222> (1)..(24) 

<223> Motif II 



<400> 100 
agc ggc tac gac gtg atc ttg ttg
Ser Gly Tyr Asp Val Ile Leu Leu
1 5



<210> 101 

<211> 8 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 101 

Ser Gly Tyr Asp Val Ile Leu Leu
1 5 



<210> 102 

<211> 27 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(27) 

<223> Motif III 



<400> 102 

ctg aac gac gac ggc atg gtg atc act
Leu Asn Asp Asp Gly Met Val Ile Thr
1 5 



<210> 103 

<211> 9 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 103 
Leu Asn Asp Asp Gly Met Val Ile Thr
1 5

<210> 104

<211> 303 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(303) 

<223> 



<400> 104 

gtg ctg gac gtg gca gcc ggc cac ggc ctg ttc ggc atc gcc ttc gcg 48
Val Leu Asp Val Ala Ala Gly His Gly Leu Phe Gly Ile Ala Phe Ala
1 5 10 15

cag cgc ttc cgc cag gct gaa gtg agc ttc ctg gac tgg gac aac gtg 96
Gln Arg Phe Arg Gln Ala Glu Val Ser Phe Leu Asp Trp Asp Asn Val
20 25 30

cta gac gta gca cgc gaa aac gcc cag gcg gcc aaa gtg gcc gag cga 144
Leu Asp Val Ala Arg Glu Asn Ala Gln Ala Ala Lys Val Ala Glu Arg
35 40 45

gcg cgt ttc ctg ccc ggc aac gca ttc gac ctc gat tac ggc agc ggc 192
Ala Arg Phe Leu Pro Gly Asn Ala Phe Asp Leu Asp Tyr Gly Ser Gly
50 55 60

tac gac gtg atc ttg ttg acc aac ttc ctg cac cat ttc gat gag gtc 240
Tyr Asp Val Ile Leu Leu Thr Asn Phe Leu His His Phe Asp Glu Val
65 70 75 80

gat ggc gag cgc atc ttg gct aag acg cgc gat gcg ctg aac gac gac 288
Asp Gly Glu Arg Ile Leu Ala Lys Thr Arg Asp Ala Leu Asn Asp Asp
85 90 95

ggc atg gtg atc act 303
Gly Met Val Ile Thr
100



<210> 105 

<211> 101 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 105 

Val Leu Asp Val Ala Ala Gly His Gly Leu Phe Gly He Ala Phe Ala 
1 5 10 15 



Gin Arg Phe Arg Gin Ala Glu Val Ser Phe Leu Asp Trp Asp Asn Val 
20 25 30 



Leu Asp Val Ala Arg Glu Asn Ala Gin Ala Ala Lys Val Ala Glu Arg 
35 40 45 
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Ala Arg Phe Leu Pro Gly Asn Ala Phe Asp Leu Asp Tyr Gly Ser Gly 
50 55 60 



Tyr Asp Val He Leu Leu Thr Asn Phe Leu His His Phe Asp Glu Val 
65 70 75 80 



Asp Gly Glu Arg He Leu Ala Lys Thr Arg Asp Ala Leu Asn Asp Asp 
85 90 95 



Gly Met Val Ile Thr
100 



<210> 106 

<211> 831 

<212> DNA 

<213> Xanthomonas albilineans 
<220> 

<221> CDS 

<222> (1)..(831) 

<223> 



<400> 106 

atg gat tca gcg tta cct aca tct gca ttt acc ttc gat ctc ttt tac 48
Met Asp Ser Ala Leu Pro Thr Ser Ala Phe Thr Phe Asp Leu Phe Tyr
1 5 10 15

acc acg gtt aac gcc tac tat cgc act gcc gca gtc aag gcg gcg atc 96
Thr Thr Val Asn Ala Tyr Tyr Arg Thr Ala Ala Val Lys Ala Ala Ile
20 25 30 

gaa ctg ggg cta ttc gat gtg gtg ggg cag cag ggc cga act ccc gca 144
Glu Leu Gly Leu Phe Asp Val Val Gly Gln Gln Gly Arg Thr Pro Ala
35 40 45 

gcc atc gcc gag gcc tgc cag gcg tcg ccg cgc ggc att cgc atc ctt 192
Ala Ile Ala Glu Ala Cys Gln Ala Ser Pro Arg Gly Ile Arg Ile Leu
50 55 60 

tgc tat tac cta gta tcg atc ggt ttt cta cgc cgc aac ggt ggc ctg 240
Cys Tyr Tyr Leu Val Ser Ile Gly Phe Leu Arg Arg Asn Gly Gly Leu
65 70 75 80 



ttc tac ata gat cgc aac atg gcc atg tac ctg gat cgt agt tcg ccc 288
Phe Tyr Ile Asp Arg Asn Met Ala Met Tyr Leu Asp Arg Ser Ser Pro
85 90 95



ggc tac ctg ggt ggc agc atc aag ttc ctg ctc tcg ccc tac atc atg 336
Gly Tyr Leu Gly Gly Ser Ile Lys Phe Leu Leu Ser Pro Tyr Ile Met
100 105 110 

agc gcc ttc acc gat ctg acc gcc gta gtc agg acc ggc aag atc aac 384
Ser Ala Phe Thr Asp Leu Thr Ala Val Val Arg Thr Gly Lys Ile Asn
115 120 125 

ctg gcg cag gac ggc gtg gtg gca ccg gat cac ccg cag tgg gtg gaa 432 
Leu Ala Gln Asp Gly Val Val Ala Pro Asp His Pro Gln Trp Val Glu
130 135 140

ttt gca cgc gcg atg gca ccg atg atg gcg ctg ccc tcg gcg ttg atc 480
Phe Ala Arg Ala Met Ala Pro Met Met Ala Leu Pro Ser Ala Leu Ile
145 150 155 160 

gcc aat atg gtg tcg ttg ccc gct gat cgg ccg att cgt gtg ctg gac 528
Ala Asn Met Val Ser Leu Pro Ala Asp Arg Pro Ile Arg Val Leu Asp
165 170 175 

gtg gca gcc ggc cac ggc ctg ttc ggc atc gcc ttc gcg cag cgc ttc 576
Val Ala Ala Gly His Gly Leu Phe Gly Ile Ala Phe Ala Gln Arg Phe
180 185 190 

cgc cag gct gaa gtg agc ttc ctg gac tgg gac aac gtg cta gac gta 624
Arg Gln Ala Glu Val Ser Phe Leu Asp Trp Asp Asn Val Leu Asp Val
195 200 205 

gca cgc gaa aac gcc cag gcg gcc aaa gtg gcc gag cga gcg cgt ttc 672
Ala Arg Glu Asn Ala Gln Ala Ala Lys Val Ala Glu Arg Ala Arg Phe
210 215 220 

ctg ccc ggc aac gca ttc gac ctc gat tac ggc agc ggc tac gac gtg 720
Leu Pro Gly Asn Ala Phe Asp Leu Asp Tyr Gly Ser Gly Tyr Asp Val 
225 230 235 240 

atc ttg ttg acc aac ttc ctg cac cat ttc gat gag gtc gat ggc gag 768
Ile Leu Leu Thr Asn Phe Leu His His Phe Asp Glu Val Asp Gly Glu
245 250 255 

cgc atc ttg gct aag acg cgc gat gcg ctg aac gac gac ggc atg gtg 816
Arg Ile Leu Ala Lys Thr Arg Asp Ala Leu Asn Asp Asp Gly Met Val
260 265 270 



atc act ttc gaa ttc 831
Ile Thr Phe Glu Phe
275



<210> 107 

<211> 277 

<212> PRT 

<213> Xanthomonas albilineans 

<400> 107 

Met Asp Ser Ala Leu Pro Thr Ser Ala Phe Thr Phe Asp Leu Phe Tyr 
1 5 10 15 



Thr Thr Val Asn Ala Tyr Tyr Arg Thr Ala Ala Val Lys Ala Ala Ile
20 25 30 



Glu Leu Gly Leu Phe Asp Val Val Gly Gln Gln Gly Arg Thr Pro Ala
35 40 45 



Ala Ile Ala Glu Ala Cys Gln Ala Ser Pro Arg Gly Ile Arg Ile Leu
50 55 60 

Cys Tyr Tyr Leu Val Ser He Gly Phe Leu Arg Arg Asn Gly Gly Leu 
65 70 75 80 
Phe Tyr Ile Asp Arg Asn Met Ala Met Tyr Leu Asp Arg Ser Ser Pro
85 90 95 



Gly Tyr Leu Gly Gly Ser Ile Lys Phe Leu Leu Ser Pro Tyr Ile Met
100 105 110 



Ser Ala Phe Thr Asp Leu Thr Ala Val Val Arg Thr Gly Lys Ile Asn
115 120 125 



Leu Ala Gln Asp Gly Val Val Ala Pro Asp His Pro Gln Trp Val Glu
130 135 140 



Phe Ala Arg Ala Met Ala Pro Met Met Ala Leu Pro Ser Ala Leu Ile
145 150 155 160 



Ala Asn Met Val Ser Leu Pro Ala Asp Arg Pro Ile Arg Val Leu Asp
165 170 175 



Val Ala Ala Gly His Gly Leu Phe Gly Ile Ala Phe Ala Gln Arg Phe
180 185 190 



Arg Gln Ala Glu Val Ser Phe Leu Asp Trp Asp Asn Val Leu Asp Val
195 200 205 



Ala Arg Glu Asn Ala Gln Ala Ala Lys Val Ala Glu Arg Ala Arg Phe
210 215 220



Leu Pro Gly Asn Ala Phe Asp Leu Asp Tyr Gly Ser Gly Tyr Asp Val 
225 230 235 240 



Ile Leu Leu Thr Asn Phe Leu His His Phe Asp Glu Val Asp Gly Glu
245 250 255 



Arg Ile Leu Ala Lys Thr Arg Asp Ala Leu Asn Asp Asp Gly Met Val
260 265 270 



Ile Thr Phe Glu Phe
275 